Bug #12728
closed
Negative lookahead does not work for "+" even though works for "@"
Description
I'll attach a test program that shows the effect. Basically, if I have a negative lookahead in the regex like (?!@) and "@" shows up in the proper location, I get a mismatch (case 1). This is expected. If I exchange the "@" with a "+" or "[+]" in the regex and with a "+" in the input, a match occurs (cases 2 and 3). This is the bug. If the "+" or "@" is removed from the string, the expected match occurs (cases 4 and 5). I have not been able to boil this down to a smaller example yet.
Updated by naruse (Yui NARUSE) about 8 years ago
- Status changed from Open to Rejected
In case 2, the regexp just behaves as if it were
t %r{
(?<!\\)\( # outer bracket
o\+
(?<!\\) ([+*]|\{\d+,\}) (?!\+) # inner repetition, non possessive
.*
(?<!\\)\) # outer bracket
(?<!\\) (?:[+*]|\{\d+,\}) # unbounded repetition, non possessive
}x, "f(o++)+"
Of course it matches.
Maybe you should use [a-zA-Z0-9]* or something instead of .*.
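For illustration, a small Ruby sketch of the point above. It assumes the full pattern from the test program is the one shown in the diff further down this ticket (with two `(.*)` groups), and it reads the suggestion as replacing both `.*` with `[a-zA-Z0-9]*`. The constant names are made up for this example.

```ruby
INPUT = 'f(o++)+'

# Pattern as reported: the first (.*) can stop after "o+", so the inner
# repetition operator is the second "+", which is followed by ")" and
# therefore passes the (?!\+) check.
GREEDY = /
  (?<!\\)\(                        # outer bracket
  (.*)
  (?<!\\) ([+*]|\{\d+,\}) (?!\+)   # inner repetition, non possessive
  (.*)
  (?<!\\)\)                        # outer bracket
  (?<!\\) (?:[+*]|\{\d+,\})        # unbounded repetition, non possessive
/x

# Assumed reading of the suggestion: a narrower class cannot swallow a "+",
# so the inner operator has to be the first "+", and (?!\+) rejects it.
NARROW = /
  (?<!\\)\(
  ([a-zA-Z0-9]*)
  (?<!\\) ([+*]|\{\d+,\}) (?!\+)
  ([a-zA-Z0-9]*)
  (?<!\\)\)
  (?<!\\) (?:[+*]|\{\d+,\})
/x

p GREEDY.match(INPUT)  # => #<MatchData "(o++)+" 1:"o+" 2:"+" 3:"">
p NARROW.match(INPUT)  # => nil
```

The first result is the same MatchData that the test program reported for case 2.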
Updated by rklemme (Robert Klemme) about 8 years ago
- File rx-mini.rb rx-mini.rb added
Yui NARUSE wrote:
In case 2, the regexp just behaves as if it were
t %r{
(?<!\\)\( # outer bracket
o\+
(?<!\\) ([+*]|\{\d+,\}) (?!\+) # inner repetition, non possessive
.*
(?<!\\)\) # outer bracket
(?<!\\) (?:[+*]|\{\d+,\}) # unbounded repetition, non possessive
}x, "f(o++)+"
Of course it matches.
Argh! Stupid me. Yes, the negative lookahead will also match with the closing bracket.
Maybe you should use [a-zA-Z0-9]* or something instead of .*.
I included a closing bracket in the negative lookahead, then it works:
$ diff -U3 x1 x2
--- x1 2016-10-01 10:55:50.595060831 +0200
+++ x2 2016-10-01 10:55:44.459048792 +0200
@@ -13,20 +13,19 @@
 /x
-MATCH
+NO MATCH
 s = 'f(o++)+'
 rx = /
 (?<!\\)\( # outer bracket
 (.*)
- (?<!\\) ([+*]|\{\d+,\}) (?!\+) # inner repetition, non possessive
+ (?<!\\) ([+*]|\{\d+,\}) (?!\+|\)) # inner repetition, non possessive
 (.*)
 (?<!\\)\) # outer bracket
 (?<!\\) (?:[+*]|\{\d+,\}) # unbounded repetition, non possessive
 /x
-match = #<MatchData "(o++)+" 1:"o+" 2:"+" 3:"">
 MATCH
 s = 'f(o++)+'
The following .* blinded me to the fact that the closing bracket can satisfy the lookahead AND then be matched by \), because a lookahead does not consume any input, like in
irb(main):001:0> /a(?=b)b/.match "abc"
=> #<MatchData "ab">
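The same effect and the fix can be sketched in Ruby as follows. This is only an illustration: the pattern is the corrected one from the diff above, while the constant names and the second input string 'f(o+x)+' are made up for this example.

```ruby
# A lookahead is zero-width: (?=b) checks for "b" without consuming it,
# so the literal "b" after it still matches the same character.
p(/a(?=b)b/.match("abc"))  # => #<MatchData "ab">

# Corrected pattern: the negative lookahead now also rejects ")" directly
# after the inner repetition operator.
FIXED = /
  (?<!\\)\(                           # outer bracket
  (.*)
  (?<!\\) ([+*]|\{\d+,\}) (?!\+|\))   # inner repetition, non possessive
  (.*)
  (?<!\\)\)                           # outer bracket
  (?<!\\) (?:[+*]|\{\d+,\})           # unbounded repetition, non possessive
/x

p FIXED.match('f(o++)+')  # => nil
p FIXED.match('f(o+x)+')  # => #<MatchData "(o+x)+" 1:"o" 2:"+" 3:"x">
```

Note that with this change an inner repetition operator directly in front of the closing bracket is never accepted, which may or may not be what the original test intends.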
I am sorry for the hassle.
Kind regards
robert
PS: Attaching the test version that produced output x2.