Misleading example in Performance section of Regexp documentation?
In the Performance section of the Regexp documentation, it is stated that changing from Regexp.new('a?' * 29 + 'a' * 29) to Regexp.new('\A' 'a?' * 29 + 'a' * 29) speeds up the match significantly. However, '\A' 'a?' * 29 actually expands to "\Aa?\Aa?....\Aa?", which doesn't seem like what you'd actually want. For example, Regexp.new('\A' 'a?' * 29 + 'a' * 29).match('a' * 50) will only match 30 'a' characters: the 29 from 'a' * 29, plus a single one from the set of '\Aa?' items. (That is, Regexp.new('\A' 'a?' * 29 + ...) doesn't match any more 'a' characters than Regexp.new('\A' 'a?' + ...) would.)
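The expansion described above can be checked directly. The following is a small sketch, not text from the documentation; the variable names are made up for illustration. It relies on the fact that Ruby joins adjacent string literals before applying `*`, so the anchor is repeated together with the optional 'a'.

```ruby
# Adjacent string literals are joined by the parser before `*` is applied,
# so '\A' 'a?' * 29 repeats the anchor along with the optional 'a'.
repeated_anchor = '\A' 'a?' * 29
single_anchor   = '\A' + 'a?' * 29

puts repeated_anchor[0, 12]   # \Aa?\Aa?\Aa?
puts single_anchor[0, 12]     # \Aa?a?a?a?a?

subject = 'a' * 50

# With the anchor repeated, only the last 'a?' can consume a character,
# because every later \A must still be at the start of the string.
matched_repeated = Regexp.new(repeated_anchor + 'a' * 29).match(subject)[0].length

# With a single anchor, the optional atoms can consume more of the subject.
matched_single = Regexp.new(single_anchor + 'a' * 29).match(subject)[0].length

puts matched_repeated   # 30
puts matched_single     # 50
```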
One can get around this by changing the expression to Regexp.new('\A' + 'a?' * 29 + 'a' * 29), in which case additional "a" characters do get matched as the string gets longer... but with that Regexp the .match('a' * 29) still takes about the same amount of time as the original one.
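A rough way to compare the two patterns is sketched below. It is an illustration only: `n` is a hypothetical stand-in for 29, reduced so the script finishes quickly, and the timings will vary by machine and Ruby version (some versions may optimize this case). The match eventually succeeds from position 0 in both patterns, so the anchor never gets a chance to rule out other start positions, which would explain why it does not help here.

```ruby
require 'benchmark'

# n is kept well below the 29 used in the documentation so that the
# exponential backtracking stays short (assumption: 18 is small enough).
n = 18
subject = 'a' * n

unanchored = Regexp.new('a?' * n + 'a' * n)
anchored   = Regexp.new('\A' + 'a?' * n + 'a' * n)

# Both patterns must try the optional atoms in many combinations before
# finding the only workable one, where every 'a?' matches nothing.
unanchored_time = Benchmark.realtime { unanchored.match(subject) }
anchored_time   = Benchmark.realtime { anchored.match(subject) }

puts format('unanchored: %.4fs', unanchored_time)
puts format('anchored:   %.4fs', anchored_time)
```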
So I wonder if there is some better example that could be used here?
Updated by drbrain (Eric Hodel) about 8 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r35862.
Nathan, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- doc/re.rdoc (Performance): Replaced incorrect example of reducing backtracking through anchoring with reduced backtracking through a range. [ruby-trunk - Bug #6525]
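The changeset itself is not quoted in this thread, so the pattern below is only an assumption about what "a range" refers to: replacing the 29 separate 'a?' atoms with one bounded repetition, a{0,29}. A minimal sketch:

```ruby
n = 29
subject = 'a' * n

# A single bounded repetition gives the engine one quantifier to shrink
# step by step, instead of 29 independent optional atoms to permute.
ranged = Regexp.new("a{0,#{n}}" + 'a' * n)
ranged_match = ranged.match(subject)

puts ranged_match[0].length   # 29
```

With this form the engine only has to back off the range one character at a time until the 29 mandatory 'a' characters fit, so the number of attempts grows with n rather than with 2**n.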