Ruby Issue Tracking System https://redmine.ruby-lang.org/ 2009-03-07T18:29:32Z Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3454 2009-03-07T18:29:32Z matz (Yukihiro Matsumoto) matz@ruby.or.jp
<ul></ul><p>=begin<br>
Hi,</p>
<p>In message "Re: <a href="/issues/1251">[ruby-core:22715]</a> [Bug <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: gsub problem (Rejected)" href="https://redmine.ruby-lang.org/issues/1251">#1251</a>] gsub problem"<br>
on Sat, 7 Mar 2009 18:08:11 +0900, Alexander Pettelkau <a href="mailto:redmine@ruby-lang.org" class="email">redmine@ruby-lang.org</a> writes:</p>
<p>|I wanted to replace "\" with "\\" in the string "\TEST":<br>
|<br>
|s="\\TEST"<br>
|puts s # Output --> "\TEST"<br>
|s.gsub!("\\","\\\\")<br>
|puts s # Output --> "\TEST"<br>
| # but EXPECTED Output "\\TEST"</p>
<p>You specified four backslashes in double quotes, which is two<br>
backslashes in a string. But the replacement string does its own backslash<br>
escaping, such as \1, and \\ (two backslashes) is transformed into<br>
one backslash. That means you've substituted one backslash for one<br>
backslash.</p>
<p>To substitute one backslash into two, you have to do</p>
<p>s.gsub!("\\","\\\\\\")</p>
<p>or</p>
<p>s.gsub!(/\\/){"\\\\"}</p>
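<p>The two levels of escaping described above can be checked with a small script. This is only a sketch: the variable names are made up for illustration, and it uses the non-destructive gsub so that each call starts from the same string.</p>

```ruby
# A string holding one real backslash followed by TEST (5 characters).
s = "\\TEST"

# Four backslashes in the literal -> two in the replacement string ->
# gsub collapses the escaped pair back into ONE backslash.
unchanged = s.gsub("\\", "\\\\")

# Eight backslashes in the literal -> four in the string -> two in the result.
doubled = s.gsub("\\", "\\\\\\\\")

# A block's return value is inserted verbatim, so two real backslashes suffice.
doubled_by_block = s.gsub(/\\/) { "\\\\" }

puts unchanged         # \TEST
puts doubled           # \\TEST
puts doubled_by_block  # \\TEST
```

<p>The block form avoids the second round of escaping entirely, which is usually the easier one to read.</p>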
<pre><code> matz.
</code></pre>
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3455 2009-03-07T18:30:06Z WoNaDo (Wolfgang Nádasi-Donner) wonado@t-online.de
<ul></ul><p>=begin<br>
Alexander Pettelkau wrote:</p>
<blockquote>
<p>Bug <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: gsub problem (Rejected)" href="https://redmine.ruby-lang.org/issues/1251">#1251</a>: gsub problem<br>
<a href="http://redmine.ruby-lang.org/issues/show/1251" class="external">http://redmine.ruby-lang.org/issues/show/1251</a></p>
<p>Author: Alexander Pettelkau<br>
Status: Open, Priority: Normal<br>
Category: core, Target version: 1.9.1<br>
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]</p>
<p>I wanted to replace "\" with "\\" in the string "\TEST":</p>
<p>s="\\TEST"<br>
puts s # Output --> "\TEST"<br>
s.gsub!("\\","\\\\")<br>
puts s # Output --> "\TEST"<br>
# but EXPECTED Output "\\TEST"</p>
<hr>
<p><a href="http://redmine.ruby-lang.org" class="external">http://redmine.ruby-lang.org</a></p>
</blockquote>
<p>After the first step, the String contains two backslashes. This string<br>
will be interpreted again, because there can be references to matched<br>
groups inside (e.g. '\1'). This second interpretation sees an escaped<br>
backslash (backslash-backslash), which results in one backslash.</p>
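<p>A short sketch of that second interpretation, using an invented name-swapping example: a group reference is expanded, while an escaped backslash turns into a literal backslash and cancels the reference.</p>

```ruby
# Single quotes keep the backslash, so gsub receives the two characters \ and 2, etc.
swapped = "John Smith".gsub(/(\w+) (\w+)/, '\2, \1')
puts swapped   # Smith, John

# '\\\\1' is the three-character string \\1. gsub turns the escaped pair
# into one literal backslash, so the following 1 is no longer a group reference.
literal = "John Smith".gsub(/(\w+) (\w+)/, '\\\\1')
puts literal   # \1
```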
<p>I think it should be documented,</p>
<p>Wolfgang Nádasi-Donner</p>
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3457 2009-03-07T20:49:03Z matz (Yukihiro Matsumoto) matz@ruby.or.jp
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Rejected</i></li></ul><p>=begin</p>
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3458 2009-03-07T21:02:27Z WoNaDo (Wolfgang Nádasi-Donner) wonado@t-online.de
<ul></ul><p>=begin<br>
Yukihiro Matsumoto wrote:</p>
<blockquote>
<p>To substitute one backslash into two, you have to do</p>
<p>s.gsub!("\\","\\\\\\")<br>
...<br>
myprompt> irb191-p0<br>
irb(main):001:0> puts "a\\b".gsub!("\\","\\\\\\")<br>
a\\b<br>
=> nil<br>
irb(main):002:0> puts "a\\b".gsub!("\\","\\\\\\\\")<br>
a\\b<br>
=> nil</p>
</blockquote>
<p>I was surprised by this result long ago, until I started to assume that<br>
the second replacement works only for \k&lt;...&gt;, \nr, and \\, and leaves the<br>
backslash as it is in all other combinations (even at the end of the string).</p>
<p>This is different from the first replacement, which always consumes a<br>
backslash as escape character...</p>
<p>myprompt> irb191-p0<br>
irb(main):001:0> puts "\\w"<br>
\w<br>
=> nil</p>
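<p>The difference between the two passes can be made visible by looking at string lengths and bytes. The following is a sketch with illustrative variable names; byte 92 is the backslash.</p>

```ruby
# Literal parsing: inside double quotes a backslash is always consumed.
literal_one = "\w"     # unknown escape, the backslash disappears
literal_two = "\\w"    # escaped backslash followed by w
p literal_one.length   # 1
p literal_two.length   # 2

# Replacement parsing: a backslash before an unrecognised character,
# or at the very end of the replacement, is left as it is.
kept  = "a_b".gsub("_", "\\w")        # replacement string is \w
six   = "a_b".gsub("_", "\\\\\\")     # three backslashes in the replacement string
eight = "a_b".gsub("_", "\\\\\\\\")   # four backslashes in the replacement string

p kept.bytes    # [97, 92, 119, 98]
p six.bytes     # [97, 92, 92, 98]
p eight.bytes   # [97, 92, 92, 98]
```

<p>Six and eight backslashes in the literal give the same result because the odd trailing backslash is kept, while each complete pair collapses to one.</p>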
<p>I think this behaviour should be documented somewhere, because it can<br>
really confuse persons, which do not use complex RegExes during their<br>
daily work.</p>
<p>Wolfgang Nádasi-Donner</p>
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3460 2009-03-08T02:53:19Z pettel (Alexander Pettelkau)
<ul></ul><p>=begin<br>
Thanks a lot for clearing that up so fast!</p>
<p>Alexander Pettelkau<br>
=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3537 2009-03-13T19:50:10Z matz (Yukihiro Matsumoto) matz@ruby.or.jp
<ul></ul><p>=begin<br>
Hi,</p>
<p>In message "Re: <a href="https://blade.ruby-lang.org/ruby-core/22719">[ruby-core:22719]</a> Re: [Bug <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: gsub problem (Rejected)" href="https://redmine.ruby-lang.org/issues/1251">#1251</a>] gsub problem"<br>
on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <a href="mailto:ed.odanow@wonado.de" class="email">ed.odanow@wonado.de</a> writes:</p>
<p>|I think this behaviour should be documented somewhere, because it can<br>
|really confuse persons, which do not use complex RegExes during their<br>
|daily work.</p>
<p>Agreed. Any opinion for concrete description? Anyone?</p>
<pre><code> matz.
</code></pre>
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3541 2009-03-13T20:53:15Z WoNaDo (Wolfgang Nádasi-Donner) wonado@t-online.de
<ul></ul><p>=begin<br>
Yukihiro Matsumoto wrote:</p>
<blockquote>
<p>In message "Re: <a href="https://blade.ruby-lang.org/ruby-core/22719">[ruby-core:22719]</a> Re: [Bug <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: gsub problem (Rejected)" href="https://redmine.ruby-lang.org/issues/1251">#1251</a>] gsub problem"<br>
on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <a href="mailto:ed.odanow@wonado.de" class="email">ed.odanow@wonado.de</a> writes:<br>
|I think this behaviour should be documented somewhere, because it can<br>
|really confuse persons, which do not use complex RegExes during their<br>
|daily work.<br>
Agreed. Any opinion for concrete description? Anyone?<br>
The contents should describe the fact that the second parsing of the<br>
replacement string will replace \\ by \, \n by the string found by<br>
anonymous group n, or by an empty string if the group doesn't exist and n is<br>
between 1 and 9, or \k&lt;name&gt; and \k'name' by the named group.</p>
</blockquote>
<p>But don't use my English. It may lead to more confusion.</p>
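<p>As a rough illustration of the named and numbered references in that description (the date pattern and variable names are invented for the example; named groups assume Ruby 1.9 or later, and only the \k&lt;name&gt; form is shown):</p>

```ruby
# (?<year>...) defines a named group; \k<year> refers to it in the replacement.
pattern = /(?<year>\d+)-(?<month>\d+)-(?<day>\d+)/
date = "2009-03-07".sub(pattern, '\k<day>.\k<month>.\k<year>')
puts date      # 07.03.2009

# A numbered reference to a group that does not exist becomes an empty string.
missing = "abc".sub(/(b)/, '[\2]')
puts missing   # a[]c
```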
<p>Wolfgang Nádasi-Donner</p>
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3542 2009-03-13T22:48:01Z stepheneb (Stephen Bannasch) stephen.bannasch@deanbrook.org
<ul></ul><p>=begin<br>
This sequence helped me understand the issue better:</p>
<blockquote>
<blockquote>
<p>a = b = "1_2_3"<br>
=> "1_2_3"<br>
for i in 0..b.length do print "#{b[i]} " end<br>
49 95 50 95 51 => 0..5<br>
b = a.gsub('_', '\\')<br>
=> "1\\2\\3"<br>
for i in 0..b.length do print "#{b[i]} " end<br>
49 92 50 92 51 => 0..5<br>
b = a.gsub('_', '\\\\')<br>
=> "1\\2\\3"<br>
for i in 0..b.length do print "#{b[i]} " end<br>
49 92 50 92 51 => 0..5<br>
b = a.gsub('_', '\\\\\\')<br>
=> "1\\\\2\\\\3"<br>
for i in 0..b.length do print "#{b[i]} " end<br>
49 92 92 50 92 92 51 => 0..7</p>
</blockquote>
</blockquote>
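<p>The same experiment can be written for Ruby 1.9 and later, where indexing a string returns characters rather than integers; String#bytes shows the byte values directly. A sketch with illustrative variable names:</p>

```ruby
a = "1_2_3"

# Single-quoted literals: '\\' is one backslash, '\\\\' is two, '\\\\\\' is three.
one   = a.gsub('_', '\\')
two   = a.gsub('_', '\\\\')
three = a.gsub('_', '\\\\\\')

p one.bytes     # [49, 92, 50, 92, 51]
p two.bytes     # [49, 92, 50, 92, 51]
p three.bytes   # [49, 92, 92, 50, 92, 92, 51]
```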
<p>=end</p> Ruby master - Bug #1251: gsub problem https://redmine.ruby-lang.org/issues/1251?journal_id=3543 2009-03-14T01:34:34Z rue (Eero Saynatkari)
<ul></ul><p>=begin<br>
Excerpts from Yukihiro Matsumoto's message of Fri Mar 13 12:47:48 +0200 2009:</p>
<blockquote>
<p>Hi,</p>
<p>In message "Re: <a href="https://blade.ruby-lang.org/ruby-core/22719">[ruby-core:22719]</a> Re: [Bug <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: gsub problem (Rejected)" href="https://redmine.ruby-lang.org/issues/1251">#1251</a>] gsub problem"<br>
on Sat, 7 Mar 2009 21:00:34 +0900, Wolfgang Nádasi-Donner <a href="mailto:ed.odanow@wonado.de" class="email">ed.odanow@wonado.de</a> writes:</p>
<p>|I think this behaviour should be documented somewhere, because it can<br>
|really confuse persons, which do not use complex RegExes during their<br>
|daily work.</p>
<p>Agreed. Any opinion for concrete description? Anyone?</p>
</blockquote>
<p>RubySpec has this to say (please add any clarifications and<br>
missing behaviour--I am sure there are some 1.9.1 cases at<br>
least):</p>
<p>ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]</p>
<p>String#sub with pattern, replacement</p>
<ul>
<li>returns a copy of self with all occurrences of pattern replaced with replacement</li>
<li>ignores a block if supplied</li>
<li>supports \G which matches at the beginning of the string</li>
<li>supports /i for ignoring case</li>
<li>doesn't interpret regexp metacharacters if pattern is a string</li>
<li>replaces \1 sequences with the regexp's corresponding capture</li>
<li>treats \1 sequences without corresponding captures as empty strings</li>
<li>replaces \& and \0 with the complete match</li>
<li>replaces \` with everything before the current match</li>
<li>replaces \' with everything after the current match</li>
<li>replaces \\+ with \+</li>
<li>replaces \+ with the last paren that actually matched</li>
<li>treats \+ as an empty string if there was no captures</li>
<li>maps \\ in replacement to \</li>
<li>leaves unknown \x escapes in replacement untouched</li>
<li>leaves \ at the end of replacement untouched</li>
<li>taints the result if the original string or replacement is tainted</li>
<li>tries to convert pattern to a string using to_str</li>
<li>raises a TypeError when pattern can't be converted to a string</li>
<li>tries to convert replacement to a string using to_str</li>
<li>raises a TypeError when replacement can't be converted to a string</li>
<li>returns subclass instances when called on a subclass</li>
<li>sets $~ to MatchData of match and nil when there's none</li>
<li>replaces \1 with \1</li>
<li>replaces \1 with \1</li>
<li>replaces \\1 with \</li>
</ul>
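<p>A few of the special sequences from the list above, tried on a made-up input string. This is a sketch, not part of RubySpec itself.</p>

```ruby
s = "hello world"

whole  = s.sub(/o w/, '[\0]')         # \0 is the complete match
amp    = s.sub(/o w/, '[\&]')         # \& is the same as \0
before = s.sub(/o w/, '[\`]')         # \` is everything before the match
after  = s.sub(/o w/, "[\\']")        # \' is everything after the match
last   = s.sub(/(l+)(x)?o/, '<\+>')   # \+ is the last group that actually matched

puts whole    # hell[o w]orld
puts before   # hell[hell]orld
puts after    # hell[orld]orld
puts last     # he<ll> world
```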
<p>String#sub with pattern and block</p>
<ul>
<li>returns a copy of self with the first occurrences of pattern replaced with the block's return value</li>
<li>sets $~ for access from the block</li>
<li>restores $~ after leaving the block</li>
<li>sets $~ to MatchData of last match and nil when there's none for access from outside</li>
<li>doesn't raise a RuntimeError if the string is modified while substituting</li>
<li>doesn't interpolate special sequences like \1 for the block's return value</li>
<li>converts the block's return value to a string using to_s</li>
<li>taints the result if the original string or replacement is tainted</li>
</ul>
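<p>The block form can be sketched as follows, reusing the "ababa" style of input; the variable names are illustrative. The return value of the block is not scanned for \1, but the match data is available inside the block.</p>

```ruby
# The block's return value is inserted verbatim: \1 stays the two characters \ and 1.
verbatim = "ababa".sub(/(b)/) { '\1' }
p verbatim.bytes   # [97, 92, 49, 97, 98, 97]

# Captures are still reachable inside the block through $~ (or $1).
upcased = "ababa".sub(/(b)/) { $~[1].upcase }
puts upcased       # aBaba
```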
<p>String#sub! with pattern, replacement</p>
<ul>
<li>modifies self in place and returns self</li>
<li>taints self if replacement is tainted</li>
<li>returns nil if no modifications were made</li>
<li>raises a TypeError when self is frozen</li>
</ul>
<p>String#sub! with pattern and block</p>
<ul>
<li>modifies self in place and returns self</li>
<li>taints self if block's result is tainted</li>
<li>returns nil if no modifications were made</li>
<li>raises a RuntimeError if the string is modified while substituting</li>
<li>raises a RuntimeError when self is frozen</li>
</ul>
<p>String#gsub with pattern and replacement</p>
<ul>
<li>doesn't freak out when replacing ^</li>
<li>returns a copy of self with all occurrences of pattern replaced with replacement</li>
<li>ignores a block if supplied</li>
<li>supports \G which matches at the beginning of the remaining (non-matched) string</li>
<li>supports /i for ignoring case</li>
<li>doesn't interpret regexp metacharacters if pattern is a string</li>
<li>replaces \1 sequences with the regexp's corresponding capture</li>
<li>treats \1 sequences without corresponding captures as empty strings</li>
<li>replaces \& and \0 with the complete match</li>
<li>replaces \` with everything before the current match</li>
<li>replaces \' with everything after the current match</li>
<li>replaces \+ with the last paren that actually matched</li>
<li>treats \+ as an empty string if there was no captures</li>
<li>maps \\ in replacement to \</li>
<li>leaves unknown \x escapes in replacement untouched</li>
<li>leaves \ at the end of replacement untouched</li>
<li>taints the result if the original string or replacement is tainted</li>
<li>tries to convert pattern to a string using to_str</li>
<li>raises a TypeError when pattern can't be converted to a string</li>
<li>tries to convert replacement to a string using to_str</li>
<li>raises a TypeError when replacement can't be converted to a string</li>
<li>returns subclass instances when called on a subclass</li>
<li>sets $~ to MatchData of last match and nil when there's none</li>
</ul>
<p>String#gsub with pattern and block</p>
<ul>
<li>returns a copy of self with all occurrences of pattern replaced with the block's return value</li>
<li>sets $~ for access from the block</li>
<li>restores $~ after leaving the block</li>
<li>sets $~ to MatchData of last match and nil when there's none for access from outside</li>
<li>raises a RuntimeError if the string is modified while substituting</li>
<li>doesn't interpolate special sequences like \1 for the block's return value</li>
<li>converts the block's return value to a string using to_s</li>
<li>taints the result if the original string or replacement is tainted</li>
</ul>
<p>String#gsub! with pattern and replacement</p>
<ul>
<li>modifies self in place and returns self</li>
<li>taints self if replacement is tainted</li>
<li>returns nil if no modifications were made</li>
<li>raises a TypeError when self is frozen</li>
</ul>
<p>String#gsub! with pattern and block</p>
<ul>
<li>modifies self in place and returns self</li>
<li>taints self if block's result is tainted</li>
<li>returns nil if no modifications were made</li>
<li>raises a RuntimeError when self is frozen</li>
</ul>
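<p>The return values of the destructive variants are easy to trip over, so here is a small sketch. The exception class raised for a frozen receiver depends on the Ruby version (the run above reports TypeError and RuntimeError for 1.8.7), so the example only rescues StandardError and records which class it saw.</p>

```ruby
s = "a_b".dup                    # dup gives a mutable copy

changed   = s.gsub!("_", "-")    # something was replaced: returns s itself
unchanged = s.gsub!("_", "-")    # nothing left to replace: returns nil

p changed.equal?(s)   # true
p unchanged           # nil

# Modifying a frozen string raises; the exact class varies between versions.
frozen_error = begin
  "a_b".freeze.gsub!("_", "-")
  nil
rescue StandardError => e
  e.class
end
p frozen_error
```

<p>Because of the nil return, chaining further calls onto gsub! or sub! is risky; the non-destructive forms always return a string.</p>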
<p>Finished in 0.030081 seconds</p>
<p>2 files, 82 examples, 251 expectations, 0 failures, 0 errors</p>
<p>--<br>
Magic is insufficiently advanced technology.</p>
<p>=end</p>