https://redmine.ruby-lang.org/https://redmine.ruby-lang.org/favicon.ico?17097754782011-11-08T11:04:32ZRuby Issue Tracking SystemRuby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=219872011-11-08T11:04:32Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Note that I am not asking for the ability to take an arbitrary regexp and construct its negated form (that is a hard problem). Instead, I am asking for a flag that simply inverts how a regexp is treated by the =~ and !~ operators (this is not a hard problem). The negation flag should not change the Regexp#source of a regexp.</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=219892011-11-08T11:53:14Zrkh (Konstantin Haase)me@rkh.im
<ul></ul><p>It is not too hard to negate a regexp, though. Simply wrap it in a negative look-ahead.</p>
<p>Konstantin</p>
<p>On Nov 7, 2011, at 23:04 , Suraj Kurapati wrote:</p>
<blockquote>
<p>Issue <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: add negation flag (v) to Regexp (Closed)" href="https://redmine.ruby-lang.org/issues/5588">#5588</a> has been updated by Suraj Kurapati.</p>
<p>Note that I am not asking for the ability to take an arbitrary regexp and construct its negated form (that is a hard problem). Instead, I am asking for a flag that simply inverts how a regexp is treated by the</p>
</blockquote> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=219922011-11-08T13:54:09Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Good idea, but negative lookahead isn't the same as negation:</p>
<p>$ irb</p>
<a name="ruby-193p0-2011-10-30-revision-33570-x86_64-linux"></a>
<h2 >ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-linux]<a href="#ruby-193p0-2011-10-30-revision-33570-x86_64-linux" class="wiki-anchor">¶</a></h2>
<blockquote>
<blockquote>
<p>"rubyperl" =~ /(?!perl)/<br>
0<br>
"rubyperl" =~ /perl/<br>
4<br>
"rubyperl" !~ /perl/<br>
false</p>
</blockquote>
</blockquote>
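<p>A minimal sketch of the distinction, using only stock Ruby. The helper names are hypothetical and exist only for this illustration: an unanchored negative lookahead succeeds at the first offset where "perl" does not begin, so it still reports a match for a string that contains "perl".</p>

```ruby
# Index of the first offset at which "perl" does NOT begin (zero-width match).
def lookahead_index(input)
  input =~ /(?!perl)/
end

# True only when "perl" occurs nowhere in the input: this is real negation.
def lacks_perl?(input)
  input !~ /perl/
end

p lookahead_index("rubyperl")  # 0, although the string contains "perl"
p lacks_perl?("rubyperl")      # false
p lacks_perl?("ruby")          # true
```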
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=219932011-11-08T14:11:41Zmatz (Yukihiro Matsumoto)matz@ruby.or.jp
<ul></ul><p>How do you treat positional information with v flag?</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=219952011-11-08T15:59:09Zalexeymuranov (Alexey Muranov)
<ul></ul><p>I would suggest dealing with it in a new (sub)class instead: arbitrary boolean combinations of regexps, without regard for positional information.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=219982011-11-08T18:32:01Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Hello Matz,</p>
<p>What do you mean by positional information:</p>
<ul>
<li>anchors (\A, ^, $, \G, etc.) ?</li>
<li>capture-group numbers ($1, $2, $3, etc.) ?</li>
<li>MatchData#begin, #end, #offset ?</li>
</ul>
<p>The v flag should only affect the low-level =~ and !~ operators in the C implementation. All subsequent processing should be performed as normal.</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=220212011-11-09T08:32:24Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I tried to implement this feature in this patch:<br>
<a href="https://github.com/sunaku/ruby/commit/79305ba55c7ece5501c9219942eaf30e01a370a9" class="external">https://github.com/sunaku/ruby/commit/79305ba55c7ece5501c9219942eaf30e01a370a9</a></p>
<p>I was able to make Ruby recognize 'v' as an embedded regexp flag,<br>
and was able to create regexps via the new Regexp::NEGATED constant.</p>
<p>However, I am stuck on the following things. Any tips? :-)</p>
<ul>
<li>
<p>We can pass '(?v:)' in embedded regexp but it does not take effect.<br>
The resulting regexp object's #options field does not reflect 'v'.</p>
</li>
<li>
<p>Make the parser accept 'v' as option at the end of literal regexps.</p>
</li>
</ul>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=220242011-11-09T09:53:08Zakr (Akira Tanaka)akr@fsij.org
<ul></ul><p>2011/11/9 Suraj Kurapati <a href="mailto:sunaku@gmail.com" class="email">sunaku@gmail.com</a>:</p>
<blockquote>
<ul>
<li>We can pass '(?v:)' in embedded regexp but it does not take effect.<br>
The resulting regexp object's #options field does not reflect 'v'.</li>
</ul>
</blockquote>
<p>The option can be embedded in the middle of a regexp: /foo(?v:bar)baz/</p>
<p>I think it doesn't work.</p>
<p>Tanaka Akira</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=220422011-11-09T18:38:11Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>You are correct, Tanaka. I have been exploring the regexp implementation to understand why and I learned that the negate flag must be compiled as an opcode (similar to multiline and ignorecase) into the re_pattern_buffer->p string.</p>
<p>My current approach is to compile (?v:...) into two opcodes (OP_BEGIN_NEGATE and OP_END_NEGATE) with normal compilation of the ... stuff inside. Later, when match_at() in regexec.c:1254 is processing the compiled regexp and encounters OP_END_NEGATE, it will perform the negation:</p>
<p>If we consumed > 0 input characters since OP_BEGIN_NEGATE, then we have successfully matched the ... stuff inside the original (?v:...) regexp. So we need to stop further processing by returning ONIG_MISMATCH.</p>
<p>Otherwise, we did not consume any input characters since OP_BEGIN_NEGATE, so we continue processing by returning ONIG_NORMAL. This effectively treats the (?v:...) as a zero-length regexp, which always matches of course.</p>
<p>That's my plan for now. Any comments?</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=220682011-11-10T09:46:51Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>After several deep excursions into the regexp codebase, I've had enough. :)</p>
<p>As Tanaka(!) showed in 2007, negative <em>global</em> regexps are already there:</p>
<a name="httpwwwruby-forumcomtopic133413595368"></a>
<h2 ><a href="http://www.ruby-forum.com/topic/133413#595368" class="external">http://www.ruby-forum.com/topic/133413#595368</a><a href="#httpwwwruby-forumcomtopic133413595368" class="wiki-anchor">¶</a></h2>
<blockquote>
<blockquote>
<p>"rubyperl" =~ /^((?!perl).)+$/<br>
nil</p>
</blockquote>
</blockquote>
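<p>A short sketch of how the quoted idiom behaves in stock Ruby. The constant name is hypothetical. Note that <code>^</code> and <code>$</code> are line anchors in Ruby, so the idiom tests each line rather than the whole string.</p>

```ruby
# The 2007 idiom quoted above: every character of the line must be one
# at which "perl" does not begin.
NOT_PERL = /^((?!perl).)+$/

p NOT_PERL =~ "rubyperl"    # nil: the line contains "perl"
p NOT_PERL =~ "ruby"        # 0
p NOT_PERL =~ ""            # nil: the + quantifier needs at least one character
p NOT_PERL =~ "perl\nruby"  # 5: the second line qualifies on its own
```

<p>If whole-string semantics are wanted instead, <code>\A</code> and <code>\z</code> could replace the line anchors, at the cost of also needing the multiline option for <code>.</code> to cross newlines.</p>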
<p>So you can close this feature request now. Sorry for the noise.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=220702011-11-10T10:32:53Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Alas, I was unable to resist the lure of implementing this, so I'm back to give this another try.</p>
<p>My current approach is to expand (?v:STUFF) into (?:(?!STUFF).) when the regexp AST is built.</p>
<p>The goal is to have a negated embedded regexp say: "there is <em>something</em> here that is NOT this".</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221002011-11-11T08:38:29Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I did it! ^_^ Please take a look:</p>
<p><a href="https://github.com/sunaku/ruby/compare/5588_regexp_v" class="external">https://github.com/sunaku/ruby/compare/5588_regexp_v</a></p>
<p>There are a few issues remaining with the implementation:</p>
<ul>
<li>
<p>Store snegate on STACK to support nested embedded negated regexps.</p>
</li>
<li>
<p>Find a better way to double-pop the stack in OP_NEGATE_END handler in regexec.c; currently it puts undefined values into s, p, etc. when the stack is popped for the second time in "goto fail" handler.</p>
</li>
</ul>
<p>After that, I need to make Ruby parser accept /.../v as a literal regexp flag.</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221022011-11-11T09:00:25Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Allow me to explain the current embedded negated regexp implementation.</p>
<p>When parsing an embedded negated regexp (?v:r), we expand it into this:</p>
<p>OP_NEGATE_START(?:r)?OP_NEGATE_END.*?</p>
<p>Here, OP_NEGATE_START and OP_NEGATE_END are opcodes in compiled pattern.</p>
<p>When the regexp engine (see match_at() in regexec.c) reaches OP_NEGATE_START, we store the current state of input (up to which character of the input string have we consumed/matched so far?) in the "snegate" variable.</p>
<p>The regexp engine then continues onward and eventually reaches OP_NEGATE_END. At this point, I compare the current state of input with "snegate". This tells us if the original embedded negated regexp (?v:r) has matched anything. Now we perform the negation:</p>
<p>If (?v:r) matched something, then treat this as a mismatch (prevent backtrack and goto fail). Otherwise, continue processing (the ".*?" after OP_NEGATE_END will take care of consuming any non-matching characters so that we can still proceed).</p>
<pre><code>assert_no_match(/a(?v:b)c/, "abc")
assert_match(/a(?v:b)c/, "axc")
assert_match(/a(?v:b)c/, "ac")
assert_match(/a(?v:b)c/, "axbc")
</code></pre>
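<p>The four assertions above can be approximated in stock Ruby, without the patch, by substituting a negative lookahead for the OP_NEGATE opcodes. This is an assumption made purely for illustration: the pattern below is not what the patch compiles, the constant name is hypothetical, and it does not reproduce how the patch aborts the search.</p>

```ruby
# Stand-in for /a(?v:b)c/ under the expansion described above:
# fail if "b" matches right after "a", otherwise let a lazy .*? skip ahead.
APPROX = /a(?!b).*?c/

p APPROX.match?("abc")   # false: "b" follows "a" directly
p APPROX.match?("axc")   # true
p APPROX.match?("ac")    # true: the negated part may match nothing
p APPROX.match?("axbc")  # true: .*? swallows "xb"
```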
<p>I hope this helps you understand my approach. Please correct me if I made a mistake.</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221322011-11-12T17:31:15Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I have updated my patch to emit a single OP_NEGATE opcode after the negated embedded regexp (?v:...). This opcode double-pops the stack to prevent the optional alternation from succeeding. Please take a look:</p>
<p><a href="https://github.com/sunaku/ruby/compare/5588_regexp_v" class="external">https://github.com/sunaku/ruby/compare/5588_regexp_v</a></p>
<p>There might be a bug in the "(?v:r)" to "(?:rN)?" expansion (where "N" is OP_NEGATE) because DONIG_DEBUG_PARSE_TREE shows the expanded "(?:rN)?" twice:</p>
<pre><code>PATTERN: /a(?v:b)c/ (ASCII-8BIT)
<list:965430>
<string:96af40>a
<enclose:965570> option:4096
<quantifier:965520>{0,1}
<string:965480>b
<quantifier:965520>{0,1} <=== BUG? Why is this here twice?
<string:965480>b
<quantifier:96c830>{0,-1}?
<anychar:96ae00>
<string:96c920>c
</code></pre>
<p>I will iron out these issues and finish the implementation in due time.</p>
<p>Cheers.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221362011-11-12T18:26:12Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>The double-printing of ENCLOSE_OPTION node was a bug in Oniguruma 5.9.2 and not in my code, for once! ;) I have submitted a fix for that bug to Kosako, the author of Oniguruma, accordingly. In case you are interested, here is the bug fix:</p>
<p><a href="https://github.com/sunaku/ruby/commit/125c31a0fe42fb2937ea64c2f31283b81bb32d8b" class="external">https://github.com/sunaku/ruby/commit/125c31a0fe42fb2937ea64c2f31283b81bb32d8b</a></p>
<p>Cheers.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221522011-11-13T18:59:30Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/2229">5588_regexp_v.patch</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/2229/5588_regexp_v.patch">5588_regexp_v.patch</a> added</li></ul><p>I fixed the 'v' flag parsing in literal regexps: the problem was the value of ONIG_OPTION_NEGATE that I chose (0x1000) collided with RB_ENCODING_OPTION mask (0xFF00).</p>
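<p>The collision can be checked with a bitwise AND. In this sketch the two numeric values are the ones quoted above; the constant and method names are hypothetical.</p>

```ruby
# Values taken from the comment above.
NEGATE_FLAG_FIRST_TRY = 0x1000
ENCODING_OPTION_MASK  = 0xFF00

# Two option values collide when they share at least one set bit.
def collides?(flag, mask)
  (flag & mask) != 0
end

p collides?(NEGATE_FLAG_FIRST_TRY, ENCODING_OPTION_MASK)  # true

# Ruby's public option bits (1, 2 and 4) sit below the mask.
[Regexp::IGNORECASE, Regexp::EXTENDED, Regexp::MULTILINE].each do |bit|
  p collides?(bit, ENCODING_OPTION_MASK)  # false
end
```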
<p>Now the implementation is finally finished. Please review the attached patch and tell me what you think. If you like it, I will update the RDoc of the Regexp functions to reflect the new 'v' flag.</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221642011-11-14T06:27:02Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I beautified my patches so they are easier to understand (especially the additions to test_regexp.rb in Ruby's test suite) and extracted the Oniguruma-only portions into a separate patch so that you can see what changes affect Ruby vs. Oniguruma. Here are the patches:</p>
<ul>
<li>
<p>Negated regexps in Oniguruma 5.9.2 (already submitted to Kosako):<br>
<a href="https://github.com/sunaku/onig-5.9.2/compare/v5.9.2...master" class="external">https://github.com/sunaku/onig-5.9.2/compare/v5.9.2...master</a></p>
</li>
<li>
<p>Negated regexps in Oniguruma 5.9.2 plus integration in Ruby trunk:<br>
<a href="https://github.com/sunaku/ruby/compare/5588_regexp_v" class="external">https://github.com/sunaku/ruby/compare/5588_regexp_v</a></p>
</li>
</ul>
<p>I would like your feedback on these. Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=221692011-11-14T10:20:21Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I have explained this implementation in more detail on my blog:</p>
<p><a href="http://snk.tuxfamily.org/log/oniguruma-negated-regexps.html" class="external">http://snk.tuxfamily.org/log/oniguruma-negated-regexps.html</a></p>
<p>I hope that helps. Cheers. :)</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222012011-11-15T04:55:33Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/2238">0001-http-redmine.ruby-lang.org-issues-5588.patch</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/2238/0001-http-redmine.ruby-lang.org-issues-5588.patch">0001-http-redmine.ruby-lang.org-issues-5588.patch</a> added</li></ul> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222082011-11-15T09:45:08Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>With your <a href="/issues/5588">[ruby-core:41040]</a>'s patch, I got following result.<br>
Is this an expected result?</p>
<p>irb(main):029:0> /a(?v:b)c/=~"abc"<br>
=> nil<br>
irb(main):030:0> /a(?v:b)c/=~"ab_c"<br>
=> nil<br>
irb(main):031:0> /a(?v:b)c/=~"a_bc"<br>
=> 0</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222182011-11-15T15:17:18Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Hi Naruse,</p>
<p>Thanks for trying my patch and for your questions! :)</p>
<p>Yui NARUSE wrote:</p>
<blockquote>
<p>With your <a href="/issues/5588">[ruby-core:41040]</a>'s patch, I got following result.<br>
Is this an expected result?</p>
</blockquote>
<p>Yes, please allow me to explain why:</p>
<blockquote>
<p>irb(main):029:0> /a(?v:b)c/=~"abc"<br>
=> nil</p>
</blockquote>
<p>First /a/ matched "a", then /(?v:b)/ mismatched "b". Failure.</p>
<blockquote>
<p>irb(main):030:0> /a(?v:b)c/=~"ab_c"<br>
=> nil</p>
</blockquote>
<p>First /a/ matched "a", then /(?v:b)/ mismatched "b". Failure.</p>
<blockquote>
<p>irb(main):031:0> /a(?v:b)c/=~"a_bc"<br>
=> 0</p>
</blockquote>
<p>First /a/ matched "a", then /(?v:b)/ matched "_", then /.*?/<br>
(created when /(?v:b)/ was expanded into /(?:bN)?.*?/ where N is<br>
OP_NEGATE) matched "b", and finally /c/ matched "c". Success.</p>
<p>See also my explanation of partly negated regexp expansion:<br>
<a href="http://snk.tuxfamily.org/log/oniguruma-negated-regexps.html" class="external">http://snk.tuxfamily.org/log/oniguruma-negated-regexps.html</a></p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222192011-11-15T15:27:40Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Suraj Kurapati wrote:</p>
<blockquote>
<p>Yui NARUSE wrote:</p>
<blockquote>
<p>irb(main):031:0> /a(?v:b)c/=~"a_bc"<br>
=> 0</p>
</blockquote>
<p>then /(?v:b)/ matched "_"</p>
</blockquote>
<p>Hmm, that explanation isn't fully accurate. Let me try again:</p>
<p>First /a/ matched "a", then /(?:bN)?.*?/ (which is the parse-tree<br>
expansion of /(?v:b)/) matched "_b", and finally /c/ matched "c".<br>
Success.</p>
<p>Sorry for the confusion.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222242011-11-15T18:08:06Znaruse (Yui NARUSE)naruse@airemix.jp
<ul><li><strong>Category</strong> set to <i>core</i></li><li><strong>Status</strong> changed from <i>Open</i> to <i>Assigned</i></li><li><strong>Assignee</strong> set to <i>naruse (Yui NARUSE)</i></li><li><strong>Target version</strong> set to <i>2.0.0</i></li></ul><p>Thank you for detailed explanation,<br>
And sorry, <a href="/issues/5588">[ruby-core:40932]</a> explains it.<br>
I believe this spec is wrong; (?v:foo) should match a sequence which doesn't include "foo".<br>
OR a sequence just match "foo".</p>
<p>For example, consider /"(?v:<|&|")"/ for the AttValue of XML <a href="http://www.w3.org/TR/xml/#NT-AttValue" class="external">http://www.w3.org/TR/xml/#NT-AttValue</a>.<br>
This regexp looks as if it should behave like /"[^<&"]*"/, but it does not:</p>
<p>irb(main):007:0> /"[^<&"]*"/=~'"aa<&a"'<br>
=> nil<br>
irb(main):008:0> /"(?v:<|&|")"/=~'"aa<&a"'<br>
=> 0</p>
<p>This feels strange.</p>
<p>Do you have any use case which show current behavior is more reasonable?</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222332011-11-16T04:46:53Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I don't have a use case for the current behavior; it was just the<br>
simplest way to remove non-matching input characters that obstructed<br>
the matching engine. And I agree with your examples; people would<br>
naturally think of /(?v:...)/ as a glorified form of /[^...]/.</p>
<p>The solution is to change the parse-tree expansion into this:</p>
<p>/(?v:r)/ => /(?:(?:rN)?.)/</p>
<p>In this manner, the /(?:rN)/ acts as a barrier that only allows<br>
input characters that <em>do not</em> match <code>r</code> to be matched by the /./.</p>
<p>However, this seems very similar to Tanaka's 2007 solution<a href="http://www.ruby-forum.com/topic/133413#595368" class="external">1</a>:</p>
<p>/(?v:r)/ => /(?:(?!r).)/</p>
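<p>To illustrate, the per-character form can be written by hand in stock Ruby and compared with the character class from the XML AttValue example. The constant names are hypothetical, and the hand-written pattern only stands in for (?v:...), which no released Ruby parses.</p>

```ruby
CHAR_CLASS    = /"[^<&"]*"/
# One character at a time, each guarded by a negative lookahead.
PER_CHARACTER = /"(?:(?!<|&|").)*"/

['"aa<&a"', '"plain"', 'x="1" y'].each do |input|
  p [input, CHAR_CLASS =~ input, PER_CHARACTER =~ input]
end
# Both patterns give nil, 0 and 2 for the three inputs.
```

<p>The two forms are not identical in general: <code>.</code> does not match a newline unless the multiline option is set, whereas the negated character class does.</p>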
<p>I will play with Oniguruma in GDB some more to learn how (?!) works.<br>
Perhaps my OP_NEGATE modification is actually unnecessary. Cheers.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222352011-11-16T09:31:19Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Hello Naruse,</p>
<p>I have updated my patch to expand /(?v:r)/ into /(?:(?!rN).)/:<br>
<a href="https://github.com/sunaku/ruby/compare/5588_regexp_v" class="external">https://github.com/sunaku/ruby/compare/5588_regexp_v</a></p>
<p>It seems that OP_NEGATE is necessary after all, because without it,<br>
Oniguruma will try to match a non-anchored partly negated regexp to<br>
the rest of the input string.</p>
<p>For example, when processing /(?v:ruby)/ =~ "ruby", Oniguruma does:</p>
<ol>
<li>/(?v:ruby)/ =~ "ruby" # failure</li>
<li>/(?v:ruby)/ =~ "uby" # success! return</li>
<li>/(?v:ruby)/ =~ "by" # success! (illustration)</li>
<li>/(?v:ruby)/ =~ "y" # success! (illustration)</li>
<li>/(?v:ruby)/ =~ "" # failure (illustration)</li>
</ol>
<p>Of course, Oniguruma stops at the first success (step 2). I added<br>
the rest of the steps to illustrate how it continues trying to match<br>
the rest of the input when a non-anchored regexp fails.</p>
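<p>The five steps can be reproduced in stock Ruby with a hand-written stand-in, assuming the (?:(?!r).) expansion discussed in this thread. The names are hypothetical. <code>\G</code> pins each attempt to the offset passed to <code>Regexp#match</code>, so the engine cannot slide forward by itself.</p>

```ruby
# Hand-written stand-in for /(?v:ruby)/, pinned to the starting offset.
PINNED = /\G(?:(?!ruby).)/

# One boolean per starting offset, including the end of the string.
def attempts(input)
  (0..input.length).map { |offset| !PINNED.match(input, offset).nil? }
end

p attempts("ruby")            # [false, true, true, true, false]
p(/(?:(?!ruby).)/ =~ "ruby")  # 1, the first offset that succeeds
```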
<p>I had encountered this problem previously when coding OP_NEGATE, and<br>
solved it by returning a special value (ONIG_MISMATCH_FROM_NEGATE).<br>
Now I simply re-used that existing logic for (?!) expanded in (?v:).</p>
<p>I have added your examples to the test_regexp.rb suite now.</p>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222362011-11-16T14:03:46Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>I am uncertain whether /(?v:ruby)/ =~ "ruby" should return nil or 1.</p>
<p>What do you think?</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222372011-11-16T14:42:28Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/2244">5588_negative_lookahead.patch</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/2244/5588_negative_lookahead.patch">5588_negative_lookahead.patch</a> added</li></ul><p>Alright, I have decided that partly negated regexps should behave like Tanaka's 2007 solution. They are easier to reason about (consistently) in that form:</p>
<p>/"(?v:ruby)+"/ =~ %q("ruby") # yields nil<br>
/"(?v:ruby)+"/ =~ %q("rubyperl") # yields nil<br>
/"(?v:ruby)+"/ =~ %q("perlruby") # yields nil<br>
/"(?v:ruby)+"/ =~ %q(abc"perlru-by"xyz) # yields 3 and $& is %q("perlru-by")</p>
<p>Please review my simplified patch (attached) that no longer uses OP_NEGATE.</p>
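<p>The four results listed above can be reproduced without the patch by writing the negative-lookahead expansion by hand. This is a sketch under that assumption; the constant name is hypothetical.</p>

```ruby
# Hand-expanded form of /"(?v:ruby)+"/ under the negative-lookahead design.
QUOTED_WITHOUT_RUBY = /"(?:(?!ruby).)+"/

p QUOTED_WITHOUT_RUBY =~ %q("ruby")      # nil
p QUOTED_WITHOUT_RUBY =~ %q("rubyperl")  # nil
p QUOTED_WITHOUT_RUBY =~ %q("perlruby")  # nil

m = QUOTED_WITHOUT_RUBY.match(%q(abc"perlru-by"xyz))
p [m.begin(0), m[0]]                     # [3, "\"perlru-by\""]
```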
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222432011-11-17T07:50:45Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul><li><strong>File</strong> <a href="/attachments/2245">5588_negative_lookahead.patch</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/2245/5588_negative_lookahead.patch">5588_negative_lookahead.patch</a> added</li></ul><p>Attaching patch with updated test case to illustrate<br>
how unanchored partly negated regexp matching works:</p>
<pre><code>/(?v:ruby)/ =~ "ruby" #=> 1
["r", "u", "by"] == [$`, $&, $'] #=> true
</code></pre>
<p>Thanks for your consideration.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222502011-11-17T17:43:22Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>I doubt this function under the current implementation should be flags because /ruby/v =~ "ruby" is now useless example.<br>
I think people who want to use this negation flag should simply write (?:(?!r).).</p>
<p>Moreover it has a bug, for example over <a href="http://www.ruby-forum.com/topic/133413" class="external">http://www.ruby-forum.com/topic/133413</a><br>
If the suffix is not "dog" but "tv", the regexp may be /cat((?:(?!cat).)*)tv/.<br>
But as following, it has false negative.</p>
<p>irb(main):013:0> /cat((?:(?!cat).)*)tv/=~"cat foo bar catv"<br>
=> nil</p>
<p>The missing piece of your proposal is a use case.<br>
All existing examples are too artificial.<br>
Design should prior to implementations, and use cases should prior to designs.</p>
<p>I'm interesting in the idea negation flag.<br>
But your proposal is limited by implementation.</p>
<p>Use cases I know is</p>
<ul>
<li>comments of C Language: /* ... */</li>
<li>SGML CDATA: </li>
<li>HTML 2.0 (RFC 1866) 3.2.5. Comments: /<!(--[^\-]*(-[^\-]+)*--)*>/</li>
<li>HTML 4.0/XML Comments: //</li>
<li>HTTP header: until CRLFCRLF</li>
<li>sequences (lines, sentences, paragraphs, and so on) which doesn't include a word</li>
</ul> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=222602011-11-18T01:33:35Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Yui NARUSE wrote:</p>
<blockquote>
<p>I doubt this function under the current implementation should be<br>
flags because /ruby/v =~ "ruby" is now useless example. I think<br>
people who want to use this negation flag should simply write<br>
(?:(?!r).).</p>
</blockquote>
<p>I respectfully disagree; for me, wholly negated regexps are more<br>
useful than partly negated ones. Please see my use cases below.</p>
<blockquote>
<p>Moreover it has a bug, for example over<br>
<a href="http://www.ruby-forum.com/topic/133413" class="external">http://www.ruby-forum.com/topic/133413</a> If the suffix is not "dog"<br>
but "tv", the regexp may be /cat((?:(?!cat).)*)tv/. But as<br>
following, it has false negative.</p>
<p>irb(main):013:0> /cat((?:(?!cat).)*)tv/=~"cat foo bar catv"<br>
=> nil</p>
</blockquote>
<p>It works if you add a word-boundary anchor at the end of "cat":</p>
<blockquote>
<blockquote>
<p>/cat((?:(?!cat\b).)*)tv/=~"cat foo bar catv"<br>
0<br>
$&<br>
"cat foo bar catv"</p>
</blockquote>
</blockquote>
<blockquote>
<p>The missing piece of your proposal is a use case. All existing<br>
examples are too artificial. Design should prior to<br>
implementations, and use cases should prior to designs.</p>
</blockquote>
<p>Very true, thanks for this much needed criticism.</p>
<blockquote>
<p>I'm interesting in the idea negation flag. But your proposal is<br>
limited by implementation.</p>
</blockquote>
<p>I only have use cases for wholly negated regexps (/.../v):</p>
<ul>
<li>some_enumerable.grep(/.../v)</li>
<li>some_string =~ some_regexp # where some_regexp given by user</li>
<li>case some_string; when /.../v; end</li>
</ul>
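<p>For comparison, these three use cases can be sketched in stock Ruby with a small wrapper object. The class and method names below are hypothetical; they are not part of Ruby or of the patch, and the wrapper returns a boolean because a non-match has no position to report.</p>

```ruby
# Hypothetical wrapper: inverts the outcome of matching while leaving the
# wrapped regexp (and its #source) untouched.
class NegatedPattern
  attr_reader :regexp

  def initialize(regexp)
    @regexp = regexp
  end

  # Used by Enumerable#grep and by case/when.
  def ===(other)
    !(@regexp === other)
  end

  # Called by String#=~ when the right-hand side is not a Regexp.
  def =~(other)
    !@regexp.match?(other)
  end
end

def classify(string)
  case string
  when NegatedPattern.new(/perl/) then :no_perl
  else :has_perl
  end
end

words = %w[ruby perl rubyperl python]
p words.grep(NegatedPattern.new(/perl/))  # ["ruby", "python"]
p words.grep_v(/perl/)                    # same result (Ruby 2.3 and later)
p classify("ruby")                        # :no_perl
p("ruby" =~ NegatedPattern.new(/perl/))   # true
```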
<p>That is why I became confused when implementing partly negated<br>
regexps (/(?v:)/).</p>
<blockquote>
<p>Use cases I know is</p>
<ul>
<li>comments of C Language: /* ... */</li>
<li>SGML CDATA: </li>
<li>HTML 2.0 (RFC 1866) 3.2.5. Comments: /<!(--[^\-]*(-[^\-]+)*--)*>/</li>
<li>HTML 4.0/XML Comments: //</li>
<li>HTTP header: until CRLFCRLF</li>
<li>sequences (lines, sentences, paragraphs, and so on) which doesn't include a word</li>
</ul>
</blockquote>
<p>These seem like good use cases of partly negated regexps (/(?v:)/).</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=224282011-11-26T21:15:10Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>Suraj Kurapati wrote:</p>
<blockquote>
<p>Yui NARUSE wrote:</p>
<blockquote>
<p>I doubt this function under the current implementation should be<br>
flags because /ruby/v =~ "ruby" is now useless example. I think<br>
people who want to use this negation flag should simply write<br>
(?:(?!r).).</p>
</blockquote>
<p>I respectfully disagree; for me, wholly negated regexps are more<br>
useful than partly negated ones. Please see my use cases below.</p>
</blockquote>
<p>Ah, you still regard /v as a wholly negated regexp, I see.</p>
<blockquote>
<blockquote>
<p>Moreover it has a bug, for example over<br>
<a href="http://www.ruby-forum.com/topic/133413" class="external">http://www.ruby-forum.com/topic/133413</a> If the suffix is not "dog"<br>
but "tv", the regexp may be /cat((?:(?!cat).)*)tv/. But as<br>
following, it has false negative.</p>
<p>irb(main):013:0> /cat((?:(?!cat).)*)tv/=~"cat foo bar catv"<br>
=> nil</p>
</blockquote>
<p>It works if you add a word-boundary anchor at the end of "cat":</p>
<blockquote>
<blockquote>
<p>/cat((?:(?!cat\b).)*)tv/=~"cat foo bar catv"<br>
0<br>
$&<br>
"cat foo bar catv"</p>
</blockquote>
</blockquote>
</blockquote>
<p>This \b hack only works when "t" and "v" are the same kind of character.<br>
When "v" is replaced with "!", it no longer works.</p>
<blockquote>
<blockquote>
<p>/cat((?:(?!cat\b).)*)t!/=~"cat foo bar cat!"<br>
=> nil</p>
</blockquote>
</blockquote>
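<p>The three cases from this exchange can be run side by side in stock Ruby. The constant names are hypothetical. The failures arise because the closing delimiter overlaps the tail of "cat", so the per-character lookahead refuses to consume the "ca" that the delimiter needs in front of it.</p>

```ruby
PLAIN         = /cat((?:(?!cat).)*)tv/
WITH_BOUNDARY = /cat((?:(?!cat\b).)*)tv/
BANG_SUFFIX   = /cat((?:(?!cat\b).)*)t!/

p PLAIN =~ "cat foo bar catv"          # nil: false negative
p WITH_BOUNDARY =~ "cat foo bar catv"  # 0: no word boundary between "t" and "v"
p BANG_SUFFIX =~ "cat foo bar cat!"    # nil: boundary exists between "t" and "!"
```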
<blockquote>
<blockquote>
<p>The missing piece of your proposal is a use case. All existing<br>
examples are too artificial. Design should prior to<br>
implementations, and use cases should prior to designs.</p>
</blockquote>
<p>Very true, thanks for this much needed criticism.</p>
<blockquote>
<p>I'm interesting in the idea negation flag. But your proposal is<br>
limited by implementation.</p>
</blockquote>
<p>I only have use cases for wholly negated regexps (/.../v):</p>
<ul>
<li>some_enumerable.grep(/.../v)</li>
<li>some_string =~ some_regexp # where some_regexp given by user</li>
<li>case some_string; when /.../v; end</li>
</ul>
<p>That is why I became confused when implementing partly negated<br>
regexps (/(?v:)/).</p>
</blockquote>
<p>They seem reasonable.<br>
If you had proposed only the wholly negated form with such use cases, this discussion would have been simpler.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=225592011-12-03T06:22:42Zsunaku (Suraj Kurapati)sunaku@gmail.com
<ul></ul><p>Interesting. Thanks for your feedback. I will submit a new patch that only contains wholly negated regexps (/.../v) this weekend. Cheers.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=270962012-06-08T21:43:47Znaruse (Yui NARUSE)naruse@airemix.jp
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Feedback</i></li></ul> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=271282012-06-10T00:10:16Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>What will <code>Regexp#match</code> with the <code>v</code> flag return, and what will be assigned to <code>$~</code>?</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=315742012-10-25T23:22:38Zyhara (Yutaka HARA)
<ul><li><strong>Target version</strong> changed from <i>2.0.0</i> to <i>2.6</i></li></ul> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=521082015-04-11T05:56:26Zakr (Akira Tanaka)akr@fsij.org
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-5 priority-4 priority-default closed" href="/issues/11049">Feature #11049</a>: Enumerable#grep_v (inversed grep)</i> added</li></ul> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=629532017-02-12T12:40:00Zk_takata (Ken Takata)
<ul></ul><p>Onigmo 6.1.1 was merged by r57603.<br>
It supports absent operator <code>(?~pattern)</code> which can be a replacement of <code>(?v:pattern)</code>.</p> Ruby master - Feature #5588: add negation flag (v) to Regexphttps://redmine.ruby-lang.org/issues/5588?journal_id=666942017-09-15T14:11:07Znaruse (Yui NARUSE)naruse@airemix.jp
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Closed</i></li></ul>
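<p>A brief sketch of the absent operator mentioned above, applied to the C comment use case raised earlier in this thread. It assumes a Ruby built with Onigmo 6.1.1 or later; the constant name and the sample input are made up for illustration.</p>

```ruby
# (?~\*/) matches the longest run of text that does not contain "*/",
# and backtracks to shorter runs when the closing delimiter needs room.
C_COMMENT = %r{/\*(?~\*/)\*/}

source = "int x; /* a **/ int y; /* b */"
p source.scan(C_COMMENT)  # ["/* a **/", "/* b */"]
```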