Bug #18797
Status: closed
Third argument to Regexp.new is a bit broken
Description
Situation
'n' or 'N' can be passed as a third argument to Regexp.new. However, the behavior is not the same as the literal n-flag or the Regexp::NOENCODING option, and it makes Regexp#encoding and the encoding of Regexp#source diverge:
/๐/n # => SyntaxError
Regexp.new('๐', Regexp::NOENCODING) # => RegexpError
re = Regexp.new('๐', nil, 'n') # => /๐/
re.options == Regexp::NOENCODING # => true
re.encoding # => ASCII-8BIT
re.source.encoding # => UTF-8
re =~ '๐' # => Encoding::CompatibilityError
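For contrast, here is a minimal sketch of the two-argument form, which does not produce the divergence. It assumes "\u0E50" as a stand-in for the non-ASCII character above, and the variable names are illustrative only. A UTF-8 source with non-ASCII characters is rejected, while a binary source yields a Regexp whose encoding agrees with that of its source.

```ruby
# Assumption: "\u0E50" stands in for the non-ASCII character used in the report.
utf8_source = "\u0E50"

# A non-ASCII UTF-8 source combined with NOENCODING is rejected outright.
rejected = begin
  Regexp.new(utf8_source, Regexp::NOENCODING)
  false
rescue RegexpError
  true
end

# With a binary (ASCII-8BIT) source, the Regexp and its source agree.
binary_source = utf8_source.b
re = Regexp.new(binary_source, Regexp::NOENCODING)
consistent = re.encoding == re.source.encoding

# Matching against a binary string works without a compatibility error.
matched = re.match?(("x" + utf8_source).b)
```

The point of the sketch is only that the second-argument path either fails early or stays consistent; it never returns a Regexp whose encoding differs from its source.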
Code
Here. There is also a test for the resulting encoding here, but it is a no-op because the whole file is set to that encoding via magic comment anyway.
The third argument was added when ASCII was still the default Ruby encoding, so I guess Regexp and source encoding still matched at that point.
Solution
It could be fixed, but my impression is that it is not useful anymore.
It was probably only added because Regexp::NOENCODING wasn't available at the time, so I think it could be deprecated like so:
Passing a third argument to Regexp.new is deprecated. Use Regexp::NOENCODING as second argument instead.
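For callers that still pass a third argument, the migration could be sketched roughly as below. The helper name regexp_new_compat is hypothetical and not part of Ruby. It assumes that only a third argument starting with 'n' or 'N' has any effect, and that a truthy non-Integer second argument means case-insensitive matching, so it should be checked against the target Ruby version before use.

```ruby
# Hypothetical helper: folds a legacy third argument into the options bitmask.
def regexp_new_compat(source, options = nil, kcode = nil)
  # Assumption: a truthy non-Integer second argument means IGNORECASE.
  flags = if options.is_a?(Integer)
            options
          elsif options
            Regexp::IGNORECASE
          else
            0
          end

  # Assumption: only a kcode starting with 'n' or 'N' is meaningful.
  flags |= Regexp::NOENCODING if kcode.to_s.start_with?("n", "N")

  Regexp.new(source, flags)
end

legacy = regexp_new_compat("abc", nil, "n")
combined = regexp_new_compat("abc", true, "N")
```

Because the helper delegates to the two-argument form, a UTF-8 source with non-ASCII characters raises RegexpError instead of silently producing a mismatched Regexp.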