Feature #19904

Deprecate or warn on multiple regular expression encodings

Added by tenderlovemaking (Aaron Patterson) 7 months ago. Updated 7 months ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:114911]

Description

It seems like you can pass multiple encoding flags to regular expression literals, but I think this should be a warning or possibly syntax error.

For example:

x = /foo/nu

p x.encoding

n says the RE should be ASCII-8BIT, and u says it should be UTF-8. The last flag wins, so in this case the regular expression gets UTF-8 encoding. However, I think it should be a warning or even a syntax error if you specify multiple encoding options on a regular expression. It seems like a mistake if programmers specify multiple.
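A minimal sketch of the behaviour described above, assuming a Ruby where stacked encoding flags are still accepted (possibly with a warning). The `examples` hash is only an illustrative name, and every literal deliberately ends in a flag other than `n`:

```ruby
# Each literal carries more than one encoding flag.
# The flag written last decides the encoding of the resulting Regexp.
examples = {
  "/foo/nu" => /foo/nu, # n, then u
  "/foo/eu" => /foo/eu, # e (EUC-JP), then u
  "/foo/us" => /foo/us, # u, then s (Windows-31J)
}

examples.each do |source, regexp|
  puts "#{source} -> #{regexp.encoding}"
end
```

In each case the earlier flags are silently discarded, which is why a warning would help catch the mistake.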

Thanks!

Updated by mame (Yusuke Endoh) 7 months ago

I think it's a good idea, but I'm curious as to what (if anything) led you to want to prohibit this.
Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?

Updated by tenderlovemaking (Aaron Patterson) 7 months ago

mame (Yusuke Endoh) wrote in #note-1:

I think it's a good idea, but I'm curious as to what (if anything) led you to want to prohibit this.
Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?

No, it didn't cause any trouble. @eileencodes (Eileen Uchitelle) and I just noticed this while implementing regular expression support with Prism.

Updated by nobu (Nobuyoshi Nakada) 7 months ago

diff --git a/parse.y b/parse.y
index 3b513d3ade8..278e7eff21b 100644
--- a/parse.y
+++ b/parse.y
@@ -8032,6 +8032,9 @@ regx_options(struct parser_params *p)
         else if (rb_char_to_option_kcode(c, &opt, &kc)) {
             if (kc >= 0) {
                 if (kc != rb_ascii8bit_encindex()) kcode = c;
+                if (kopt) {
+                    rb_warn0("multiple encoding options, ignored preceding");
+                }
                 kopt = opt;
             }
             else {
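The check in the patch above can be modelled roughly in Ruby. This is only a sketch of the idea, not the parser's code: `ENCODING_FLAGS` and `multiple_encoding_flag_warnings` are hypothetical names, and the helper works on a plain string of flag characters rather than on the lexer state.

```ruby
# Hypothetical model: the four regexp flags that select an encoding.
ENCODING_FLAGS = %w[n e s u].freeze

# Returns one warning message for every encoding flag that follows
# an earlier encoding flag in the given flag string.
def multiple_encoding_flag_warnings(flags)
  seen = false
  flags.each_char.filter_map do |c|
    next unless ENCODING_FLAGS.include?(c)

    warning = "multiple encoding options, ignored preceding" if seen
    seen = true
    warning
  end
end

p multiple_encoding_flag_warnings("imx").size # no encoding flags
p multiple_encoding_flag_warnings("nu").size  # second flag overrides the first
```

Non-encoding flags such as `i`, `m`, and `x` are skipped, so only a second or later encoding flag produces a message.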

Updated by nobu (Nobuyoshi Nakada) 7 months ago

lol

  it "selects last of multiple encoding specifiers" do
    /foo/ensuensuens.should == /foo/s
  end