Bug #20504: Interpolated string literal in regexp encoding handling - Ruby - Ruby Issue Tracking System


Added by kddnewton (Kevin Newton) about 2 years ago. Updated over 1 year ago.

Status: Closed

Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN

[ruby-core:117990]

Description

There is some very odd behavior here, and I'm not sure whether it is intentional, so I'm looking for guidance. Consider:

# encoding: us-ascii

interp = "\x80"
regexp = /#{interp}/

Here, the regexp variable holds an ascii-8bit regular expression containing the interpolated byte. However, if you inline that interpolation:

# encoding: us-ascii

regexp = /#{"\x80"}/

you get a syntax error, saying it's an invalid multi-byte character. I'm not sure what the rule is here, as it seems inconsistent. Is this the correct behavior?

I would prefer if it would create an ascii-8bit regular expression like the first example, which would be consistent.
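The variable case can be checked at runtime with a short sketch. It assumes String#b as a stand-in for the us-ascii magic comment (so the snippet does not depend on the file's source encoding); the test string "a\x80b" is purely illustrative and not part of the report.

```ruby
# A binary (ASCII-8BIT) string holding the single byte 0x80.
# Assumption: String#b replaces the "# encoding: us-ascii" magic comment
# used in the report, so the result does not depend on the source encoding.
interp = "\x80".b

# Interpolating through a variable defers regexp construction to runtime.
regexp = /#{interp}/

# The report describes this regexp as ascii-8bit.
puts regexp.encoding

# The byte is part of the pattern, so a binary string containing it matches.
puts regexp.match?("a\x80b".b)
```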

#1 [ruby-core:118014] Updated by Eregon (Benoit Daloze) about 2 years ago

Agreed, the current behavior breaks referential transparency and unexpectedly analyzes string literals inside interpolated parts.
This leads to extra confusion and, I would think, has no value in real-world uses of interpolated regexps, since its only effect is to raise an error where there would otherwise be none.

So I think this is a bug and the implementation should not analyze those parts and consequently the behavior should be the same as with the extra local variable.

#2 Updated by Eregon (Benoit Daloze) about 2 years ago

Tracker changed from Misc to Bug
Backport set to 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN

#3 [ruby-core:118043] Updated by kddnewton (Kevin Newton) about 2 years ago

I'm fine with it analyzing the string literals. I would just prefer that it take the same codepath as the interpolated variable case, in which it would produce an ascii-8bit regular expression rather than raising an error.

#4 [ruby-core:118199] Updated by mame (Yusuke Endoh) about 2 years ago

Discussed at the dev meeting, and @matz (Yukihiro Matsumoto) said /#{"\x80"}/ should not raise a SyntaxError but return a binary encoded regexp object.
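To see which behavior a given Ruby exhibits, one option is to evaluate the inline form from a string whose encoding is US-ASCII, on the assumption that eval takes the source encoding from the string it is given. The helper name probe_inline_interpolation is hypothetical, and the result depends on the Ruby version, so no particular output is claimed here.

```ruby
# Hypothetical helper (not from the issue): evaluates the inline form
# /#{"\x80"}/ as US-ASCII source and reports what happened.
def probe_inline_interpolation
  # %q keeps the text literal, so eval receives the regexp literal as source
  # text rather than an already-interpolated string.
  source = %q{/#{"\x80"}/}.dup.force_encoding(Encoding::US_ASCII)

  # Assumption: eval uses the string's encoding as the source encoding.
  regexp = eval(source)
  "Regexp with encoding #{regexp.encoding}"
rescue SyntaxError, RegexpError => e
  # Rubies without the fix are expected to land here, per the report.
  "#{e.class} raised"
end

puts probe_inline_interpolation
```

A Ruby that follows the decision above should take the first branch and report a binary encoding; one that predates the fix should report the error class instead.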

#5 Updated by nobu (Nobuyoshi Nakada) over 1 year ago

Status changed from Open to Closed

Applied in changeset git|6bbb470dc77a671c67411a5d3a2564bd0a665a9c.

[Bug #20504] Move dynamic regexp concatenation to iseq compiler
