Feature #19904
Open
Deprecate or warn on multiple regular expression encodings
Description
It seems like you can pass multiple encoding flags to regular expression literals, but I think this should be a warning or possibly a syntax error.
For example:
x = /foo/nu
p x.encoding
The n flag says the RE should be ASCII-8BIT, and the u flag says it should be UTF-8. The last flag wins, so in this case the regular expression gets UTF-8 encoding. However, I think it should be a warning or even a syntax error if you specify multiple encoding options on a regular expression. It seems like a mistake if programmers specify more than one.
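A minimal sketch of the behavior described above, for anyone who wants to check it locally. The helper name `encoding_info` is hypothetical and only used for illustration; the sketch assumes a Ruby version that still accepts multiple encoding flags on a literal.

```ruby
# Hypothetical helper (not part of Ruby): summarize how a regexp's
# encoding was resolved.
def encoding_info(regexp)
  { encoding: regexp.encoding, fixed: regexp.fixed_encoding? }
end

multi  = /foo/nu  # two encoding flags; `u` comes last
single = /foo/u   # only the final flag

p encoding_info(multi)   # encoding chosen for the two-flag literal
p encoding_info(single)  # encoding chosen for the one-flag literal
p multi == single        # whether the earlier `n` left any trace
```

If the last flag really does win, the two literals should compare equal, which is what makes the earlier flag look like a silent mistake.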
Thanks!
Updated by mame (Yusuke Endoh) about 1 year ago
I think it's a good idea, but am curious as to what (if any) led you to want to prohibit this.
Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?
Updated by tenderlovemaking (Aaron Patterson) about 1 year ago
mame (Yusuke Endoh) wrote in #note-1:
> I think it's a good idea, but am curious as to what (if any) led you to want to prohibit this.
> Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?
No, it didn't cause any trouble. @eileencodes (Eileen Uchitelle) and I just noticed this while implementing regular expression support with Prism.
Updated by nobu (Nobuyoshi Nakada) about 1 year ago
diff --git a/parse.y b/parse.y
index 3b513d3ade8..278e7eff21b 100644
--- a/parse.y
+++ b/parse.y
@@ -8032,6 +8032,9 @@ regx_options(struct parser_params *p)
         else if (rb_char_to_option_kcode(c, &opt, &kc)) {
             if (kc >= 0) {
                 if (kc != rb_ascii8bit_encindex()) kcode = c;
+                if (kopt) {
+                    rb_warn0("multiple encoding options, ignored preceding");
+                }
                 kopt = opt;
             }
             else {
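The idea behind the patch can be modeled in a few lines of Ruby. This is only a rough sketch, not the parse.y code: `scan_encoding_flags` is a hypothetical name, and the model assumes the encoding flags are exactly n, e, s and u and that a warning is emitted each time an earlier encoding flag is overridden.

```ruby
# Hypothetical Ruby model of the patched flag loop (not the actual C code).
# Returns the winning encoding flag (or nil) and the warnings collected.
def scan_encoding_flags(flags)
  winner = nil
  warnings = []
  flags.each_char do |c|
    # Skip non-encoding options such as i, m, x, o.
    next unless "nesu".include?(c)
    # Mirrors the `if (kopt)` check: an earlier encoding flag is overridden.
    warnings << "multiple encoding options, ignored preceding" if winner
    winner = c
  end
  [winner, warnings]
end

winner, warnings = scan_encoding_flags("nu")
p winner         # the flag that takes effect
p warnings.size  # how many earlier flags were ignored
```

Under this model every additional encoding flag produces its own warning, so a literal with many flags would warn repeatedly rather than once.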
Updated by nobu (Nobuyoshi Nakada) about 1 year ago
lol
it "selects last of multiple encoding specifiers" do
  /foo/ensuensuens.should == /foo/s
end