Ruby Issue Tracking System (https://redmine.ruby-lang.org/)
Ruby master - Bug #14103: Regexp absense operator has no chance to ^C
https://redmine.ruby-lang.org/issues/14103?journal_id=94122
2021-10-13T15:48:59Z, jeremyevans0 (Jeremy Evans) merch-redmine@jeremyevans.net
<ul></ul><p>I submitted a pull request to fix this: <a href="https://github.com/ruby/ruby/pull/4960" class="external">https://github.com/ruby/ruby/pull/4960</a></p>
<p>The issue is unlikely to be specific to the absence operator; I think it affects any case where a regexp takes a long time due to backtracking. In addition to allowing interrupts, the pull request also allows yielding to other threads during a long regexp match (since checking for interrupts has that effect).</p>
https://redmine.ruby-lang.org/issues/14103?journal_id=96771
2022-03-10T19:07:00Z, jeremyevans (Jeremy Evans) code@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset <a class="changeset" title="Allow interrupting regexps that backtrack Fixes [Bug #14103] Co-authored-by: Nobuyoshi Nakada <..." href="https://redmine.ruby-lang.org/projects/ruby-master/repository/git/revisions/edc8576a65b7082597d45a694434261ec3ac0d9e">git|edc8576a65b7082597d45a694434261ec3ac0d9e</a>.</p>
<hr>
<p>Allow interrupting regexps that backtrack</p>
<p>Fixes [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Regexp absense operator has no chance to ^C (Closed)" href="https://redmine.ruby-lang.org/issues/14103">#14103</a>]</p>
<p>Co-authored-by: Nobuyoshi Nakada <a href="mailto:nobu@ruby-lang.org" class="email">nobu@ruby-lang.org</a></p>
https://redmine.ruby-lang.org/issues/14103?journal_id=97004
2022-03-23T17:07:15Z, mame (Yusuke Endoh) mame@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Closed</i> to <i>Open</i></li></ul><p>This change degrades the performance of regular expression matching when frequent backtracking occurs.</p>
<p>Before edc8576a65b7082597d45a694434261ec3ac0d9e</p>
<pre><code>$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-10T19:06:33Z master edc8576a65) [x86_64-linux]
real 0m3.824s
user 0m3.820s
sys 0m0.004s
</code></pre>
<p>After edc8576a65b7082597d45a694434261ec3ac0d9e</p>
<pre><code>$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-10T19:06:33Z master edc8576a65) [x86_64-linux]
real 0m4.608s
user 0m4.588s
sys 0m0.016s
</code></pre>
<p>I have no idea if this may lead to any actual problem, but how about reducing the frequency of rb_thread_check_ints? This PR makes the check only once every 128 backtracks.</p>
<p><a href="https://github.com/ruby/ruby/pull/5697" class="external">https://github.com/ruby/ruby/pull/5697</a></p>
<p>This restores the original performance.</p>
<pre><code>$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-23T14:55:49Z master 8f1c69f27c) [x86_64-linux]
real 0m3.702s
user 0m3.696s
sys 0m0.000s
</code></pre>
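<p>A minimal sketch of the throttling idea, written in Ruby rather than the C of the actual patch. The names <code>CHECK_INTERVAL</code> and <code>run_backtracks</code>, and the block that stands in for <code>rb_thread_check_ints</code>, are hypothetical and chosen for illustration only; this is not code from the PR.</p>

```ruby
# Hypothetical illustration of "check only once every 128 backtracks".
# None of these names come from Ruby's regexp engine.
CHECK_INTERVAL = 128

# Simulates `steps` backtracks. The given block (a stand-in for an
# interrupt check such as rb_thread_check_ints) runs once every
# `interval` backtracks. Returns the number of checks performed.
def run_backtracks(steps, interval: CHECK_INTERVAL)
  checks = 0
  counter = 0
  steps.times do
    counter += 1
    next unless counter >= interval

    counter = 0
    checks += 1
    yield if block_given?
  end
  checks
end

# 1000 backtracks with an interval of 128 trigger 7 checks
# (at backtracks 128, 256, ..., 896).
puts run_backtracks(1_000)
```

<p>The counter keeps the per-backtrack cost to an increment and a comparison, while the more expensive check still runs often enough that an interrupt is noticed after at most one interval of backtracks.</p>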
<p>Still, it allows immediate interrupts.</p>
<pre><code>$ ./miniruby -e '/^a*b?a*$/ =~ "a" * 20000 + "x"'
^C-e:1:in `<main>': Interrupt
$ ./miniruby -e "/(?<x> (?<! a ) a+ ){0}
(?<y> (?~ \g<z> ) ){0}
(?<z> (?<! a ) \k<x> (?! a ) ){0}
\g<x> \g<y> \g<z>
/xo =~ (1..1024).map{|x| 'b' + 'a' * x }.join"
^C-e:5:in `<main>': Interrupt
</code></pre>
https://redmine.ruby-lang.org/issues/14103?journal_id=97009
2022-03-24T00:48:02Z, mame (Yusuke Endoh) mame@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Fixed at 9112cf4ae7f7ea8ab33c282aa02eec812421aeab.</p>