Bug #20807 (Open)
String#gsub fails when called from string subclass with a block passed
Description
When String#gsub is called from a string subclass with a block, Regexp.last_match is nil, but the passed block is still executed. Here is example code:
def call_gsub(str)
  str.gsub(/%/) do
    puts "checking #{str.class}"
    puts "Special variable value: #{$&}"
    puts "Regexp.last_match = #{Regexp.last_match.inspect}\n\n"
    raise "Special variable $& is not assigned, but block is called" if $&.nil?
  end
end
class MyString < String
  def gsub(*args, &block)
    super(*args, &block) # just forward everything
  end
end
text = 'test%text_with_special_character'
call_gsub(String.new(text)) # original string
call_gsub(MyString.new(text)) # string subclass
Result:
checking String
Special variable value: %
Regexp.last_match = #<MatchData "%">
checking MyString
Special variable value:
Regexp.last_match = nil
gsub_bug.rb:7:in `block in call_gsub': Special variable $& is not assigned, but block is called (RuntimeError)
from gsub_bug.rb:13:in `gsub'
from gsub_bug.rb:13:in `gsub'
from gsub_bug.rb:2:in `call_gsub'
from gsub_bug.rb:20:in `<main>'
I expect the result to be the same for both classes, since MyString just wraps the same method:
checking String
Special variable value: %
Regexp.last_match = #<MatchData "%">
checking MyString
Special variable value: %
Regexp.last_match = #<MatchData "%">
Maybe there is something off with the control frame when params are forwarded?
Thanks in advance!
Updated by Dan0042 (Daniel DeLorme) 3 months ago
Regexp.last_match and other regexp-related pseudo-globals do not work across more than one stack frame. Since you override #gsub, they are set only inside MyString#gsub, not in the frame where your block reads them.
You can confirm with this:
def test(klass)
  p klass
  klass.new("test").gsub(/s/,'x')
  p result: $~
end
class MyString1 < String
end
test(MyString1)
#prints:
#{:result=>#<MatchData "s">}
class MyString2 < String
  def gsub(...)
    super
  ensure
    p ensure: $~
  end
end
test(MyString2)
#prints:
#{:ensure=>#<MatchData "s">}
#{:result=>nil}
It would be possible to fix this by propagating Regexp.last_match up every "super" stack frame until we reach the originating non-super frame. It would allow some interesting use cases (like logging the time spent in every Regexp#match). But it's a lot of work for a very niche use.
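Until something like that exists, a user-level workaround seems possible: the overriding method can copy its own $~ into the frame that owns the block before calling it. The sketch below is only an illustration, not a proposed change to Ruby. The class name PatchedString is hypothetical, and the approach assumes the block was created in Ruby code, so that Proc#binding is available.

```ruby
# Hypothetical subclass that overrides #gsub but keeps $~, $& and
# Regexp.last_match usable inside the caller's block.
class PatchedString < String
  def gsub(*args, &block)
    # Without a block there is nothing to forward match data to.
    return super unless block

    super(*args) do |match|
      # Here $~ is this method's frame-local value, set by String#gsub.
      # Assign it in the frame where the caller's block was defined.
      begin
        block.binding.eval("proc { |m| $~ = m }").call($~)
      rescue ArgumentError
        # Procs created at the C level (e.g. from &:sym) have no binding;
        # in that case the match data is simply not propagated.
      end
      block.call(match)
    end
  end
end

text = PatchedString.new("test%text_with_special_character")
result = text.gsub(/%/) { "[#{$&}]" }
puts result
```

A simpler alternative on the calling side is to avoid the special variables and use the block argument instead (str.gsub(/%/) { |m| ... }), since the matched string is passed to the block regardless of which frame holds $~.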
Updated by jeremyevans0 (Jeremy Evans) 2 months ago
- Related to Bug #8444: Regexp vars $~ and friends are not thread local added
- Related to Bug #12689: Thread isolation of $~ and $_ added
- Related to Bug #14364: Regexp last match variable in procs added
Updated by jeremyevans0 (Jeremy Evans) 2 months ago
- Related to Bug #11808: Different behavior between Enumerable#grep and Array#grep added