Feature #19904
open
Deprecate or warn on multiple regular expression encodings
Added by tenderlovemaking (Aaron Patterson) about 1 year ago.
Updated about 1 year ago.
Description
It seems like you can pass multiple encoding flags to regular expression literals, but I think this should be a warning or possibly a syntax error.
For example:
x = /foo/nu
p x.encoding
Here n says the RE should be ASCII-8BIT, and u says it should be UTF-8. The last flag wins, so in this case the regular expression gets UTF-8 encoding. Specifying more than one encoding option on a single regular expression looks like a programmer mistake, so I think it should produce a warning or even a syntax error.
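A small sketch of the "last flag wins" behavior described above, for anyone who wants to try it. The helper encoding_summary is a hypothetical name used only for this illustration, not part of Ruby. The comparisons rely on what this report states, plus Regexp#== treating two regexps as equal only when source, options, and encoding all match.

```ruby
# Hypothetical helper (illustration only): collect a regexp's encoding state.
def encoding_summary(re)
  { encoding: re.encoding, fixed: re.fixed_encoding? }
end

both   = /foo/nu  # two encoding flags on one literal; per this report, u wins
only_u = /foo/u   # the single-flag form it should be equivalent to

p encoding_summary(both)
p encoding_summary(only_u)

# If the last flag really wins, the two literals compare equal.
p both == only_u
```

If a future Ruby turns this into a syntax error, the /foo/nu literal above would stop parsing, which is exactly the change being proposed.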
Thanks!
I think it's a good idea, but am curious as to what (if any) led you to want to prohibit this.
Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?
mame (Yusuke Endoh) wrote in #note-1:
I think it's a good idea, but am curious as to what (if any) led you to want to prohibit this.
Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?
No, it didn't cause any trouble. @eileencodes (Eileen Uchitelle) and I just noticed this while implementing regular expression support with Prism.
diff --git a/parse.y b/parse.y
index 3b513d3ade8..278e7eff21b 100644
--- a/parse.y
+++ b/parse.y
@@ -8032,6 +8032,9 @@ regx_options(struct parser_params *p)
         else if (rb_char_to_option_kcode(c, &opt, &kc)) {
             if (kc >= 0) {
                 if (kc != rb_ascii8bit_encindex()) kcode = c;
+                if (kopt) {
+                    rb_warn0("multiple encoding options, ignored preceding");
+                }
                 kopt = opt;
             }
             else {
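To make the intent of the patch easier to follow, here is a rough Ruby model of the check it adds. This is only a sketch under simplifying assumptions: ENCODING_FLAGS and scan_encoding_flags are made-up names, the model looks only at the four encoding flags (n, e, s, u), and it ignores the special handling of n and kcode in the real C code.

```ruby
# Illustrative model only; these names do not exist in Ruby or parse.y.
ENCODING_FLAGS = %w[n e s u].freeze

# Scans a flag string such as "nu" and returns [winning_flag_or_nil, warnings].
def scan_encoding_flags(flags)
  winner = nil
  warnings = []
  flags.each_char do |c|
    next unless ENCODING_FLAGS.include?(c) # non-encoding flags (i, m, x, o) are skipped
    # Mirrors the added `if (kopt)` check: an encoding option was already seen.
    warnings << "multiple encoding options, ignored preceding" if winner
    winner = c # the later flag replaces the earlier one
  end
  [winner, warnings]
end

p scan_encoding_flags("nu")
p scan_encoding_flags("imx")
```

In this model a repeated flag such as "uu" also triggers the warning, which matches the patch as written, since it only checks whether any encoding option has been set before.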
lol
it "selects last of multiple encoding specifiers" do
  /foo/ensuensuens.should == /foo/s
end