Bug #17029
Closed
URI.parse considers https://example.com/### invalid when browsers consider it valid
Description
I have a form with <input type="url" required>, and in the backend I try to extract the domain with URI.parse(url).host.
A user was able to submit a value like https://example.com/###, which passed the browser's validation check but was rejected by URI.parse with this error:
3: from /Users/helix/.rbenv/versions/2.7.1/lib/ruby/2.7.0/uri/common.rb:234:in `parse'
2: from /Users/helix/.rbenv/versions/2.7.1/lib/ruby/2.7.0/uri/rfc3986_parser.rb:73:in `parse'
1: from /Users/helix/.rbenv/versions/2.7.1/lib/ruby/2.7.0/uri/rfc3986_parser.rb:67:in `split'
URI::InvalidURIError (bad URI(is not URI?): "https://example.com/###")
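As a stopgap on the backend, the exception can be rescued so that a browser-accepted value does not crash the request. This is only a minimal sketch: `host_or_nil` is a hypothetical helper name, not part of the uri library, and returning nil for rejected input is an assumption about what the application wants.

```ruby
require "uri"

# Hypothetical helper (not part of the uri library): returns the host,
# or nil when URI.parse rejects the input.
def host_or_nil(url)
  URI.parse(url).host
rescue URI::InvalidURIError
  nil
end

puts host_or_nil("https://example.com/#frag").inspect
# The value from this report; on Ruby 2.7.1 URI.parse raised
# URI::InvalidURIError for it, so the helper would return nil there.
puts host_or_nil("https://example.com/###").inspect
```

This hides the error rather than fixing it; the caller still has to decide what to do with a nil host.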
You can try the browser's behavior at MDN's demo: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/url
This is what the MDN page says about validation:
The syntax of a URL is fairly intricate. It's defined by WHATWG's URL Living Standard ( https://url.spec.whatwg.org/ ) and is described for newcomers in our article What is a URL? ( https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_is_a_URL )
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
This does seem like a bug to me. It looks like https://example.com/### should be a valid URL with a fragment of ##. However, the uri library is maintained in a separate repository. Please submit this as an issue at https://github.com/ruby/uri/issues/new.
Updated by phluid61 (Matthew Kerwin) over 4 years ago
It's not valid according to RFC 3986 (the URI standard) but that is pretty old now. I suspect switching from the IETF URI spec to the WHATWG URL spec would have other consequences, too.
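For anyone who needs to keep parsing such input with the RFC 3986 based parser, one possible workaround is to percent-encode every "#" after the first one before calling URI.parse, since RFC 3986 does not allow a literal "#" inside the fragment. The sketch below is an assumption about what an application might do, not behavior of the uri library, and `escape_extra_hashes` is a hypothetical helper name.

```ruby
require "uri"

# Hypothetical helper: keeps the first "#" as the fragment delimiter and
# percent-encodes any further "#" characters inside the fragment.
def escape_extra_hashes(url)
  base, sep, fragment = url.partition("#")
  return url if sep.empty? # no fragment, nothing to do

  base + sep + fragment.gsub("#", "%23")
end

escaped = escape_extra_hashes("https://example.com/###")
puts escaped                 # https://example.com/#%23%23
puts URI.parse(escaped).host # example.com
```

Note that this changes the string being parsed: the fragment is stored as %23%23 rather than ##, so it should be decoded again if the original fragment matters.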
Updated by nileshtr (Nilesh Trivedi) over 4 years ago
I filed an issue at the uri library's GitHub repo: https://github.com/ruby/uri/issues/8
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
- Status changed from Open to Closed