Ruby Issue Tracking System https://redmine.ruby-lang.org/
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=39077 2013-05-02T02:34:09Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul></ul><p>s/RFC 2896/RFC 2396/</p>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=49752 2014-10-31T10:05:00Z naruse (Yui NARUSE) naruse@airemix.jp
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/49752/diff?detail_id=35907">diff</a>)</li></ul>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=67821 2017-11-15T11:03:49Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul><li><strong>Subject</strong> changed from <i>uri squeezes a sequence of slashes in merging paths when it shouldn't</i> to <i>URI squeezes a sequence of slashes in merging paths when it shouldn't</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/67821/diff?detail_id=46815">diff</a>)</li><li><strong>Backport</strong> deleted (<del><i>1.9.3: UNKNOWN, 2.0.0: UNKNOWN</i></del>)</li></ul>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=67822 2017-11-15T11:24:32Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul></ul><p>Addressable::URI (of the addressable gem) properly preserves sequences of slashes in a path, so using it instead is a workaround.</p>
<p>I've confirmed that none of <code>net/url</code> of Go, <code>URI</code> of Perl, <code>urlparse.urljoin</code> of Python2, and <code>java.net.URL</code> of Java does this kind of unwanted normalization.</p>
<p>The one exception I could find, however, was <code>urllib.parse</code> of Python3. (!)</p>
<pre><code>% python3
Python 3.6.3 (default, Nov 4 2017, 01:15:26)
[GCC 4.2.1 Compatible FreeBSD Clang 3.8.0 (tags/RELEASE_380/final 262564)] on freebsd11
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urljoin
>>> urljoin('http://example.com/foo//bar/baz', '.')
'http://example.com/foo/bar/'
</code></pre>
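<p>For comparison, the same resolution can be tried with Ruby's own <code>URI.join</code>. This is only a minimal sketch: the result depends on the Ruby version, and the assumption here is that versions containing the fix for this issue keep the empty path component while older ones squeeze it.</p>

```ruby
require 'uri'

base = 'http://example.com/foo//bar/baz'

# Resolve the relative reference "." against the base URI.
result = URI.join(base, '.').to_s

# Assumption: a Ruby with the fix for this issue keeps "//",
# while an older Ruby collapses it to "/".
preserved = result.include?('/foo//bar/')

puts result
puts(preserved ? 'double slash preserved' : 'double slash squeezed')
```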
<p>I'm not sure whether Python3's behavior is an intentional change from Python2, but I believe every slash in the path part should be retained.</p>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=67823 2017-11-15T12:35:28Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul></ul><p>I've also checked the <code>url</code> module of node.js, and it doesn't do this either. <a href="https://github.com/nodejs/node/blob/78545039d65fa24841454f161c3711ce4b5226bc/test/parallel/test-url-relative.js" class="external">Their test cases</a> do not include explicit examples of how to deal with sequences of slashes in a path, but a double slash is retained in several of the expected results of relative path resolution, which means double slashes are not subject to squeezing.</p>
<p>Looking into <a href="https://url.spec.whatwg.org/" class="external">WHATWG URL spec</a>, there's no indication that a sequence of slashes in a URL path should be treated specially. A path is simply a "list" of "items" separated with the slash (/, U+002F) and any item can naturally be an empty string. Even when resolving a "double-dot segment" and consequently "removing" a path "item" you are never told to "remove" extra items that are empty.</p>
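<p>The "list of items" view can be sketched in a few lines of Ruby. The helper below, <code>resolve_dot_segments</code>, is hypothetical and not part of Ruby's URI library. It assumes an absolute path, and only illustrates that resolving dot segments never requires dropping empty items.</p>

```ruby
# Hypothetical helper for illustration only; not part of Ruby's URI library.
# Assumes an absolute path (one that starts with "/").
def resolve_dot_segments(path)
  # A limit of -1 keeps empty items, including a trailing one.
  items = path.split('/', -1)
  output = []

  items.each_with_index do |item, index|
    last = (index == items.length - 1)
    case item
    when '.'
      # A single-dot item is dropped; a final one leaves a trailing slash.
      output << '' if last
    when '..'
      # Remove exactly one item, empty or not, but never the leading
      # empty item that stands for the root.
      output.pop if output.length > 1
      output << '' if last
    else
      # Ordinary items, including empty ones, are kept as they are.
      output << item
    end
  end

  output.join('/')
end

puts resolve_dot_segments('/foo//bar/../baz')  # /foo//baz
puts resolve_dot_segments('/foo//bar/.')       # /foo//bar/
puts resolve_dot_segments('/foo//../baz')      # /foo/baz
```

<p>In the last call the double-dot item removes the empty item itself, which is the only way an empty item disappears in this model.</p>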
<p>So, as you can see, Ruby and Python3 are the only exceptions. No specification indicates that a sequence of slashes in a URL path should be treated specially, and the majority of library implementations in other languages are consistent with that. I presume there are few programmers who would rely on the current behavior.</p>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=67889 2017-11-22T07:41:55Z duerst (Martin Dürst) duerst@it.aoyama.ac.jp
<ul></ul><p>knu (Akinori MUSHA) wrote:</p>
<blockquote>
<p>I presume there are few programmers who would rely on the current behavior.</p>
</blockquote>
<p>I agree that there should be few programmers who would rely on consecutive slashes being collapsed to a single slash. However, I also think it's a bad idea for programmers or users to rely on multiple consecutive slashes being preserved. Using multiple consecutive slashes in a URI is a bad idea.</p>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=67890 2017-11-22T08:37:28Z phluid61 (Matthew Kerwin) matthew@kerwin.net.au
<ul></ul><p>duerst (Martin Dürst) wrote:</p>
<blockquote>
<p>Using multiple consecutive slashes in a URI is a bad idea.</p>
</blockquote>
<p>It definitely doesn't play nicely with dot-segment resolution, but then I wouldn't want to bear the burden of deciding how to resolve that, one way or the other.</p>
<p>In this particular case, I think it is <em>incorrect</em> to automatically remove empty segments, but I also think it's bad to have them in the first place.</p>
<p>What if there was a way for the programmer to explicitly invoke the current behaviour (e.g. by sending a different message), so the side-effect is expected?</p>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=68288 2017-12-12T05:46:57Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul><li><strong>File</strong> <a href="/attachments/6864">0001-Allow-empty-path-components-in-a-URI-Bug-8352.patch</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/6864/0001-Allow-empty-path-components-in-a-URI-Bug-8352.patch">0001-Allow-empty-path-components-in-a-URI-Bug-8352.patch</a> added</li><li><strong>Assignee</strong> changed from <i>akira (akira yamada)</i> to <i>naruse (Yui NARUSE)</i></li></ul><p>Naruse-san, could you review the attached patch?</p>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=68297 2017-12-12T08:08:28Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul><li><strong>Target version</strong> set to <i>2.5</i></li></ul>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=68372 2017-12-14T01:11:40Z knu (Akinori MUSHA) knu@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset trunk|r61218.</p>
<hr>
<p>Allow empty path components in a URI [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: URI squeezes a sequence of slashes in merging paths when it shouldn't (Closed)" href="https://redmine.ruby-lang.org/issues/8352">#8352</a>]</p>
<ul>
<li>generic.rb (URI::Generic#merge, URI::Generic#route_to): Fix a bug<br>
where a sequence of slashes in the path part gets collapsed to a<br>
single slash. According to the relevant RFCs and WHATWG URL<br>
Standard, empty path components are simply valid and there is no<br>
special treatment defined for them, so we just keep them as they<br>
are.</li>
</ul>
Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't https://redmine.ruby-lang.org/issues/8352?journal_id=78883 2019-06-26T04:08:20Z jeremyevans0 (Jeremy Evans) merch-redmine@jeremyevans.net
<ul><li><strong>Has duplicate</strong> <i><a class="issue tracker-1 status-5 priority-4 priority-default closed" href="/issues/12562">Bug #12562</a>: URI merge removes empty segment contrary to RFC 3986</i> added</li></ul>