Handling of [ and ] in URIs
This is loosely related to the old Tracker entry http://rubyforge.org/tracker/?group_id=426&atid=1698&func=detail&aid=1952
The problem surfaced for me again just today, while working with WWW::Mechanize parsing a private website (sorry, no public test case) using some Drupal image gallery software. It produces URIs like http://localhost/images/file.jpg without escaping the [ and ].
I know this is against the RFC (RFC 3986 states in section 3.2.2, Host: "This is the only place where square bracket characters are allowed in the URI syntax."), but I would be glad if Ruby were a little more liberal (in the old spirit of "Be liberal in what you accept and strict in what you produce") and accepted these malformed URIs. Bonus points if it also corrected them...
I had posted a "fix" to the old tracker item, but while it is a working workaround, it doesn't look like a valid solution to me anymore.
This bug also applies to Ruby 1.9.
Updated by akira (akira yamada) over 10 years ago
- Status changed from Open to Rejected
"[" and "]" cannot be in URI::PATTERN::UNRESERVED because the URI module is used as a syntax checker for URI-like strings.
It is basically the same as above. But you can use your own URI parser. Example:
parser = URI::Parser.new(:UNRESERVED=>URI::PATTERN::UNRESERVED+'')
uri = parser.parse("http://exmaple.jp/.jpg")
URI::InvalidURIError: bad URI(is not URI?): http://exmaple.jp/.jpg
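For the "correct them" part of the request, another option is to percent-encode the stray brackets before handing the string to the stock parser. The following is only a rough sketch under some assumptions: `parse_lenient` is a hypothetical helper (not part of the URI library), the sample URI `file[1].jpg` is made up for illustration, and the authority is left untouched so that bracketed IPv6 hosts keep working.

```ruby
require 'uri'

# Hypothetical helper, not part of Ruby's URI library:
# percent-encode "[" and "]" everywhere except the authority,
# then parse the result with the standard parser.
def parse_lenient(str)
  # Keep "scheme://authority" as-is so hosts like [::1] are not escaped.
  m = str.match(%r{\A([a-zA-Z][a-zA-Z0-9+.\-]*://[^/?#]*)(.*)\z}m)
  prefix, rest = m ? [m[1], m[2]] : ['', str]
  escaped = rest.gsub('[', '%5B').gsub(']', '%5D')
  URI.parse(prefix + escaped)
end

# Made-up example URI with unescaped brackets in the path.
uri = parse_lenient('http://localhost/images/file[1].jpg')
puts uri.path
```

This keeps the URI module strict as a syntax checker while letting the caller decide to repair input first; whether that repair is appropriate depends on how the server interprets the escaped form.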