Bug #14103
closedRegexp absense operator has no chance to ^C
Description
Following script hangs, with no respond to ^C
.
/(?<x> (?<! a ) a+ ){0}
(?<y> (?~ \g<z> ) ){0}
(?<z> (?<! a ) \k<x> (?! a ) ){0}
\g<x> \g<y> \g<z>
/xo =~ (1..1024).map{|x| 'b' + 'a' * x }.join
Updated by jeremyevans0 (Jeremy Evans) about 3 years ago
I submitted a pull request to fix this: https://github.com/ruby/ruby/pull/4960
The issue is unlikely to be specific to the absence operator, I think it affects any case where a regexp takes a long time due to backtracking. In addition to allowing interrupts, the pull request also allows yielding to other threads during a long regexp match (since checking for interrupts has that effect).
Updated by jeremyevans (Jeremy Evans) almost 3 years ago
- Status changed from Open to Closed
Applied in changeset git|edc8576a65b7082597d45a694434261ec3ac0d9e.
Allow interrupting regexps that backtrack
Fixes [Bug #14103]
Co-authored-by: Nobuyoshi Nakada nobu@ruby-lang.org
Updated by mame (Yusuke Endoh) almost 3 years ago
- Status changed from Closed to Open
This change degrades the performance of regular expression matching when frequent backtracking occurs.
Before edc8576a65b7082597d45a694434261ec3ac0d9e
$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-10T19:06:33Z master edc8576a65) [x86_64-linux]
real 0m3.824s
user 0m3.820s
sys 0m0.004s
After edc8576a65b7082597d45a694434261ec3ac0d9e
$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-10T19:06:33Z master edc8576a65) [x86_64-linux]
real 0m4.608s
user 0m4.588s
sys 0m0.016s
I have no idea if this may lead to any actual problem, but how about reducing the frequency of rb_thread_check_ints? This PR makes the check only once every 128 backtracks.
https://github.com/ruby/ruby/pull/5697
This restores the original performance.
$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-23T14:55:49Z master 8f1c69f27c) [x86_64-linux]
real 0m3.702s
user 0m3.696s
sys 0m0.000s
Still, it allows immediate interrupts.
$ ./miniruby -e '/^a*b?a*$/ =~ "a" * 20000 + "x"'
^C-e:1:in `<main>': Interrupt
$ ./miniruby -e "/(?<x> (?<! a ) a+ ){0}
(?<y> (?~ \g<z> ) ){0}
(?<z> (?<! a ) \k<x> (?! a ) ){0}
\g<x> \g<y> \g<z>
/xo =~ (1..1024).map{|x| 'b' + 'a' * x }.join"
^C-e:5:in `<main>': Interrupt
Updated by mame (Yusuke Endoh) almost 3 years ago
- Status changed from Open to Closed
Fixed at 9112cf4ae7f7ea8ab33c282aa02eec812421aeab.