Bug #9974
closed
Regression: URI.parse allows invalid URIs
Added by ggilder (Gabriel Gilder) over 10 years ago.
Updated almost 8 years ago.
Description
$ ruby -v
ruby 2.2.0dev (2014-06-23 trunk 46517) [x86_64-darwin13]
$ ruby -ruri -e "puts URI.parse('http://test_example')"
http://test_example
Compare to Ruby 2.1.2:
$ ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
$ ruby -ruri -e "puts URI.parse('http://test_example')"
/Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:214:in `initialize': the scheme http does not accept registry part: test_example (or bad hostname?) (URI::InvalidURIError)
from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/http.rb:84:in `initialize'
from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:214:in `new'
from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:214:in `parse'
from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:747:in `parse'
from -e:1:in `<main>'
This appears to be a regression as hostnames cannot legally contain underscores: http://en.wikipedia.org/wiki/Hostname#Restrictions%5Fon%5Fvalid%5Fhost%5Fnames
Oops, sorry for bad code formatting. In any case you should see that in Ruby 2.1.2 URI::InvalidURIError is raised.
- Description updated (diff)
- Priority changed from 5 to Normal
Thank you for checking trunk version,
You may know, RFC3986 allows underscores in reg-name though DNS name doesn't include underscore.
host = IP-literal / IPv4address / reg-name
reg-name = *( unreserved / pct-encoded / sub-delims )
pct-encoded = "%" HEXDIG HEXDIG
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
- Related to Bug #8241: If uri host-part has underscore ( '_' ), 'URI#parse' raise 'URI::InvalidURIError' added
The URI abstraction speaks to RFC3986 (DNS) more directly than RFC952 (hostnames). The confusion is understandable.
Still, standards-based systems exist right now that allow this (e.g. we have an nginx deployed application that chose an underscore 4th-level name, but it works every place else).
But any Ruby clients that attempt to integrate against this endpoint fail. This puts Ruby at a disadvantage. I'd argue that the existence of Python/Java solutions etc. may not be 'legal', but is at least 'de facto'.
- Status changed from Open to Rejected
Larry Kyrala wrote:
The URI abstraction speaks to RFC3986 (DNS) more directly than RFC952 (hostnames). The confusion is understandable.
Still, standards-based systems exist right now that allow this (e.g. we have an nginx deployed application that chose an underscore 4th-level name, but it works every place else).
But any Ruby clients that attempt to integrate against this endpoint fail. This puts Ruby at a disadvantage. I'd argue that the existence of Python/Java solutions etc. may not be 'legal', but is at least 'de facto'.
JavaScript's new URL("http://test_example")
accepts underscore.
Therefore I don't think error is de facto.
Also available in: Atom
PDF
Like0
Like0Like0Like0Like0Like0Like0