Bug #14101

closed

Unreliable handling of groups nested within absent/absence operator of regex

Added by tom-lord (Tom Lord) almost 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: 2.5.0
[ruby-core:83743]

Description

The new absent/absence regex operator, added to Onigmo and bundled into Ruby since v2.4.1, supports nested groups such as:

"abb".match /(?~(a|b)b)/
 => #<MatchData "a" 1:"a">
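For context, here is a minimal sketch of the operator without any nested groups: `(?~X)` matches text that contains no match for X. The constant `C_COMMENT`, the helpers `first_comment` and `free_of?`, and the sample inputs are illustrative and not part of the report.

```ruby
# Assumes a Ruby whose bundled Onigmo supports the absence operator
# (2.4.1 or later, per the report).

# A C-style comment: "/*", then text containing no "*/", then "*/".
C_COMMENT = %r{/\*(?~\*/)\*/}

# Returns the first C-style comment found in source, or nil.
def first_comment(source)
  source[C_COMMENT]
end

# True when text contains no occurrence of the literal string forbidden.
# Anchoring both ends makes the absence check cover the whole string.
def free_of?(text, forbidden)
  /\A(?~#{Regexp.escape(forbidden)})\z/.match?(text)
end

puts first_comment("x = 1; /* a */ y = 2; /* b */")
puts free_of?("xyz", "abc")
puts free_of?("xabcx", "abc")
```

The first call should stop at the first `*/` rather than the last one, which is the usual motivation for the operator; the two `free_of?` calls should print `true` and `false`.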

However, under some scenarios (I haven't been able to determine the exact cause), the execution fails:

"abb".match /(?~(a|c)c)/
ArgumentError: negative string size (or size too big)
from (irb):1:in `scan'

Interestingly, running the above in pry produces a malformed object:

"abb".match /(?~(a|c)c)/
#=> #<MatchData "abb" 1:#<MatchData:0x3fd47ec398d4>

"abb".scan /(?~(a|c)c)/
#=> ArgumentError: negative string size (or size too big)

I am not sure whether this bug belongs to the Ruby project or to Onigmo.
Documentation on the operator is still a work in progress (https://github.com/k-takata/Onigmo/issues/87); perhaps nested groups should not be allowed by the engine?

Updated by tom-lord (Tom Lord) almost 5 years ago

Here's a slightly more minimal reproduction example:

"abb".match /(?~(a)c)/
#=> ArgumentError: negative string size (or size too big)

My best guess is that the regexp engine ends up in an unexpected state, where the capture group still references an orphaned object.
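To compare the patterns from this thread across Ruby builds, a small harness along these lines could be used. It is only a sketch: `probe` and `PATTERNS` are hypothetical names introduced here, and the script makes no claim about which result a given build produces.

```ruby
# Hypothetical helper: runs String#scan with the given pattern and reports
# whether the ArgumentError described in this report was raised.
def probe(input, pattern)
  input.scan(pattern)
  :ok
rescue ArgumentError
  :argument_error
end

# Patterns taken from the report and the follow-up comment.
PATTERNS = [
  /(?~(a|b)b)/, # reported as working
  /(?~(a|c)c)/, # reported as failing on 2.5.0
  /(?~(a)c)/    # smaller reproduction from the comment above
].freeze

PATTERNS.each do |pattern|
  puts "ruby #{RUBY_VERSION} #{pattern.source}: #{probe('abb', pattern)}"
end
```

On a build affected by the bug, the last two lines would be expected to report `argument_error`, matching the output quoted above.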

Updated by nobu (Nobuyoshi Nakada) almost 5 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r60755.


regexec.c: invalidate previously matched position

  • regexec.c (match_at): invalidate end position not yet matched
    when new start position is pushed, to dispose previously stored
    position. [ruby-core:83743] [Bug #14101]