Bug #21097

`x = a rescue b in c` and `def f = a rescue b in c` parsed differently between parse.y and prism

Added by tompng (tomoya ishida) 8 months ago. Updated 40 minutes ago.

Status:
Assigned
Target version:
-
ruby -v:
ruby 3.5.0dev (2025-01-27T08:19:32Z master c3c7300b89) +YJIT +MN +PRISM [arm64-darwin22]
[ruby-core:120819]

Description

x = a rescue b in c
(x = (a rescue b)) in c # parse.y, prism(ruby 3.4)
x = (a rescue (b in c)) # prism(ruby 3.5)
def f = a rescue b in c #=> true(parse.y), :f(prism)
(def f = (a rescue b)) in c # parse.y
def f = (a rescue (b in c)) # prism

Prism and parse.y parse the following code identically:

a rescue b in c # a rescue (b in c)
x = a rescue b # x = (a rescue b)
x = b in c # (x = b) in c
def f = a rescue b # def f = (a rescue b)
def f = b in c # (def f = b) in c
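
To make the difference between the two groupings of `x = a rescue b in c` observable, here is a minimal sketch that spells out both readings with explicit parentheses, so the result does not depend on which parser is in use. The method `a`, the helper method names, and the choice of `Integer` as the pattern `c` are hypothetical stand-ins, not taken from the report.

```ruby
# Hypothetical stand-in for `a`: it always raises, so the rescue branch is taken.
def a
  raise "boom"
end

# Grouping reported for parse.y and prism (ruby 3.4): (x = (a rescue b)) in c
def assignment_then_match(b)
  matched = ((x = (a rescue b)) in Integer)
  [x, matched]   # x holds the rescued value; the match result is separate
end

# Grouping reported for prism (ruby 3.5): x = (a rescue (b in c))
def assign_match_result(b)
  x = (a rescue (b in Integer))
  x              # x holds the boolean result of the pattern match
end

p assignment_then_match(1)  # [1, true]
p assign_match_result(1)    # true
```

With the first grouping `x` ends up as `1`; with the second it ends up as `true`, which is why the choice of grouping is visible to programs.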

Related issues 2 (0 open, 2 closed)

Related to Ruby - Bug #21132: Changed postposition `rescue` and `if` behavior since Ruby 3.4 (Closed, prism)
Related to Ruby - Bug #21378: variable pinning does not look for method arguments (Feedback, matz (Yukihiro Matsumoto))

Updated by tompng (tomoya ishida) 8 months ago

`not ... in` and `not ... rescue` have the same problem:

$ ruby --parser=parse.y -e "def f = not 1 in 2; p f"
false
$ ruby --parser=prism   -e "def f = not 1 in 2; p f"
true
$ ruby --parser=parse.y -e "def f = not a rescue true; p f"
false
$ ruby --parser=prism   -e "def f = not a rescue true; p f"
true
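
The outputs above are consistent with the following explicitly parenthesized readings. This is a sketch of an inference from the printed values, not something stated in the report; the method names and `undefined_a` (a name that raises NameError when called) are hypothetical.

```ruby
# Two readings of `def f = not 1 in 2`.
def not_only              # (def f = not 1) in 2: the body is just `not 1`
  (not 1)
end

def not_of_match          # def f = not (1 in 2)
  not (1 in 2)
end

# Two readings of `def f = not a rescue true`.
def not_of_rescue         # def f = not (a rescue true)
  not (undefined_a rescue true)
end

def rescue_of_not         # def f = (not a) rescue true
  (not undefined_a) rescue true
end

p not_only       # false, matches the parse.y output
p not_of_match   # true,  matches the prism output
p not_of_rescue  # false, matches the parse.y output
p rescue_of_not  # true,  matches the prism output
```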

Updated by tenderlovemaking (Aaron Patterson) 8 months ago

  • Assignee set to prism

Updated by hsbt (Hiroshi SHIBATA) 7 months ago

  • Status changed from Open to Assigned

Updated by nobu (Nobuyoshi Nakada) 7 months ago

  • Related to Bug #21132: Changed postposition `rescue` and `if` behavior since Ruby 3.4 added

Updated by matz (Yukihiro Matsumoto) 7 months ago

The behavior of Prism in 3.5 is close to my intention.

Matz.

Updated by kddnewton (Kevin Newton) 7 months ago

In this case, I'm not sure if the assignee should be prism, if we now have the desired behavior. @tompng (tomoya ishida) does this match your understanding?

Updated by tompng (tomoya ishida) 7 months ago

  • Status changed from Assigned to Open
  • Assignee deleted (prism)

Updated by mame (Yusuke Endoh) 6 months ago

  • Status changed from Open to Assigned
  • Assignee set to nobu (Nobuyoshi Nakada)

Updated by yui-knk (Kaneko Yuichiro) 12 days ago

  • Related to Bug #21378: variable pinning does not look for method arguments added

Updated by yui-knk (Kaneko Yuichiro) 3 days ago

I am working on this ticket and have some questions about grammar where I'd like to ask for Matz's opinions.

Precedence of not, in, rescue

The precedence of not, in, and rescue was raised in https://bugs.ruby-lang.org/issues/21097#note-1.

Q1. Is it okay to combine them as follows?

# def f = not (1 in 2)
def f = not 1 in 2

# def f = (not a) rescue true
def f = not a rescue true

# def f = (not (a in 1)) rescue true
def f = not a in 1 rescue true

# def f = (not a) rescue (1 in 1)
def f = not a rescue 1 in 1

This is consistent with how it's interpreted when written in the body of a method definition that has an end.

def f1
  # not (1 in 2)
  not 1 in 2
end

def f2
  # (not a) rescue true
  not a rescue true
end

def f3
  # (not (a in 1)) rescue true
  not a in 1 rescue true
end

def f4
  # (not a) rescue (1 in 1)
  not a rescue 1 in 1
end

By the way, the operator precedence derived from this interpretation is modifier_rescue < not < in, which differs from the in < not < modifier_rescue defined in the precedence table.

Precedence of =, in, rescue

I think this ticket is about making the precedence of =, in, and rescue consistent.
Since the desired interpretation of x = a rescue b in c is x = (a rescue (b in c)), the precedence of these operators should be = < modifier_rescue < in.

Q2. The following code does not follow this precedence in either parse.y or prism, so is it correct to say that it's supposed to be interpreted as follows?

x = b in c # x = (b in c)
def f = b in c # def f = (b in c)

What's happening in parse.y

In the parse.y definition, single-line pattern matching can't appear on the right-hand side of an assignment.
Furthermore, the left-hand side of single-line pattern matching (the left side of in or =>) is an arg rule, and an arg can contain an assignment.
Because of this, x = a rescue b in c is interpreted as (x = a rescue b) in c.

Trying to make single-line pattern matching an arg

In Ruby's grammar, the right-hand side of an assignment is often the arg rule. For example, the right-hand side of x = 1, x = 1 + 2, and x = obj.m are all arg.
These assignments have the unique characteristic that they can be arguments to a method. This means that m(x = 2, x) is valid code.
Currently, single-line pattern matching is an expr rule, which is not part of arg. The arg grammar element, as its name suggests, can be a method argument.
If we were to make single-line pattern matching an arg, it would cause conflicts with several tokens.
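
The point that an arg assignment can itself be a method argument can be checked directly. In this small sketch the two-parameter method `m` and the wrapper `call_with_assignment` are hypothetical.

```ruby
# Hypothetical method that returns its two arguments.
def m(first, second)
  [first, second]
end

def call_with_assignment
  # `x = 2` is an arg, so it is accepted as the first argument;
  # arguments are evaluated left to right, so the second `x` is already 2.
  m(x = 2, x)
end

p call_with_assignment  # [2, 2]
```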

Conflict on “=>”

Let's consider the code m(v => [1]).
This can be interpreted as either a pattern matching => or a hash =>.
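
As a sketch of the two meanings of `=>` involved, the following shows what `m(v => [1])` does today (a hash argument) next to a standalone rightward pattern match. The method `m` and the helper names are hypothetical.

```ruby
# Hypothetical method that returns its single argument.
def m(arg)
  arg
end

def hash_reading
  v = :key
  m(v => [1])     # today: passes a hash whose key is the value of v
end

def pattern_reading
  v = [1]
  v => [x]        # rightward pattern match: binds x to the only element
  x
end

p hash_reading == { key: [1] }  # true
p pattern_reading               # 1
```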

Conflict on “,”

Let's consider the code m(v in 1, 2, 3).
Because the brackets [] can be omitted in an array pattern, this code can be interpreted as either an array pattern v in 1, 2, 3 or as a method call to m with multiple arguments like m((v in 1), 2, 3).
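
Both readings can be written unambiguously today, which makes the conflict concrete. This sketch assumes Ruby 3.1 or later (where brackets may be omitted in one-line pattern matching); the method `m` and the helper names are hypothetical.

```ruby
# Hypothetical method that returns all of its arguments.
def m(*args)
  args
end

def array_pattern_reading(v)
  v in 1, 2, 3            # brackets omitted: same as v in [1, 2, 3]
end

def three_argument_reading(v)
  m((v in 1), 2, 3)       # the match result is only the first argument
end

p array_pattern_reading([1, 2, 3])  # true
p three_argument_reading(1)         # [true, 2, 3]
```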

Conflict on “|”

Let's consider the code m(v in :a | :b).
The | symbol is used in pattern matching to separate patterns. However, it's also used as a binary operator.
Therefore, this code can be interpreted as either a method call to m with a pattern matching argument of v in :a | :b, or as a method call to m with an argument that connects v in :a and :b with the binary operator |.
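
The two readings of `|` give different results for some inputs, as this sketch shows. The helper names are hypothetical; the second method adds parentheses to force the binary-operator reading.

```ruby
def alternative_pattern_reading(v)
  v in :a | :b        # one pattern with two alternatives
end

def binary_operator_reading(v)
  (v in :a) | :b      # boolean match result combined with :b via |
end

p alternative_pattern_reading(:c)  # false: :c matches neither alternative
p binary_operator_reading(:c)      # true:  false | :b is true
```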

Conflict on “^”

Let's consider the code m(v in 1, ^a).
It may be a bit surprising that an ambiguity arises with the ^ used for variable pinning. However, recall that the brackets [] can be omitted in an array pattern, and that a trailing comma can be used in an array pattern.
This code can therefore be interpreted in two ways: either the v in 1, ^a part is treated as a pattern, or it's a pattern matching v in 1, (with a trailing comma) connected to a by the binary operator ^.

Should the ambiguity be resolved?

These ambiguities aren't a parser problem; they're a grammar problem.
It's possible to resolve these ambiguities by deciding on a specific interpretation for each case. For example, we could resolve the conflict with => by only making in part of the arg rule for single-line pattern matching, or we could fix the conflict with , by prohibiting the omission of brackets [] in array patterns.
However, let's consider a different approach here.

An alternative approach: Allow it only in assignments that aren't part of the arg rule

Many assignments in Ruby are arg, but there are some exceptions.
For example, a method call where the parentheses are omitted (called a command) can be written on the right-hand side of an assignment, but that assignment cannot be used as an argument.

x = cmd 1, 2 # OK
m(x = cmd 1, 2) # Syntax Error
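
One way to check snippets like these without running them is to ask the interpreter only to compile the source. This is a sketch that assumes CRuby, where `RubyVM::InstructionSequence.compile` raises SyntaxError for invalid code; the helper name `parses?` is hypothetical, and the result reflects whichever parser the running Ruby uses by default.

```ruby
# Returns true when `source` compiles without a SyntaxError.
# The source is only compiled, never executed.
def parses?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

p parses?("x = cmd 1, 2")     # the form marked OK above
p parses?("m(x = cmd 1, 2)")  # the form reported above as a syntax error
```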

Let's consider an approach that allows single-line pattern matching on the right-hand side only for these kinds of assignments, which I'll temporarily call "top-level assignments." For example, the implementation would be as here.

Q3. What do you think about limiting assignments with pattern matching to the top level?

What can be written in the body of an endless method definition?

Currently, various things can be written in the body of an endless method definition.

def m = 1 + 2
def m = cmd 1, 2
def m = a in b
def f = a rescue b
def f = a rescue b in c 

Let's go over a few examples of things that cannot be written in the body of an endless method definition. Since an arg can be written in the body, we will specifically look at the production rules included in stmt and expr.

# expr
def m = cmd 1, 2 do end # SyntaxError

def m = !cmd 1, 2 # SyntaxError
!cmd 1, 2 # ok
x = !cmd 1, 2 # SyntaxError

# stmt
def m = x = cmd 1, 2 # SyntaxError
x = cmd 1, 2 # ok

def m = x += cmd 1, 2 # SyntaxError
x += cmd 1, 2 # ok

def m = def m2 = obj.m 1 # SyntaxError
def m2 = obj.m 1 # ok

I will provide a simple explanation for each.
In the first code example, when we write private def m = cmd 1, 2 do end, an ambiguity arises over whether the do end block is attached to private or to cmd.
The second code example is simply a matter of whether or not to allow it. Since it's allowed when it's not an assignment but prohibited on the right-hand side of an assignment, the question is which behavior to make consistent.
For the three cases starting with stmt, a conflict occurs with and, or, and do. This is because the grammar has two competing rules: command_rhs: command_call_value and command_rhs: command_call_value modifier_rescue after_rescue stmt.

# def m = x = cmd 1, 2 rescue (a and b)
# (def m = x = cmd 1, 2 rescue a) and b
def m = x = cmd 1, 2 rescue a and b

# private def m = x = (cmd 1, 2 do exp end)
# private (def m = x = cmd 1, 2) do exp end
private def m = x = cmd 1, 2 do exp end

For example, we can resolve this by changing command_call_value to command and the stmt after rescue to an arg (simply changing stmt to expr would still leave a conflict with do...end because expr: command_call exists).

command_rhs	: command   %prec tOP_ASGN
            | command modifier_rescue after_rescue arg

Q4. What can be written in the body of an endless method definition?

Associativity of modifier rescue

When used as a postfix operator, rescue is a left-associative operator, but its behavior on the right-hand side of an assignment is slightly different.
On the right-hand side of an assignment, only one postfix rescue is associated with the expression.

# (((a rescue b) rescue c) rescue d)
a rescue b rescue c rescue d

# (x = a rescue b) rescue c rescue d
x = a rescue b rescue c rescue d
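
The practical effect of this difference can be sketched with explicit parentheses. Here `fail_a` and `fail_b` are hypothetical methods that always raise, and the helper names are also hypothetical; the chain is shortened to two rescues to keep the example small.

```ruby
def fail_a
  raise "a failed"
end

def fail_b
  raise "b failed"
end

# Statement form: ((a rescue b) rescue c)
def statement_grouping
  ((fail_a rescue fail_b) rescue :c)
end

# Assignment form as described above: (x = (a rescue b)) rescue c
def assignment_grouping
  (x = (fail_a rescue fail_b)) rescue :c
  x   # the assignment never completed, so x is still nil
end

# For contrast: the whole rescue chain on the right-hand side
def nested_assignment
  x = ((fail_a rescue fail_b) rescue :c)
  x
end

p statement_grouping   # :c
p assignment_grouping  # nil
p nested_assignment    # :c
```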

Q5. In the case of an endless method definition, it uses the same binding method as a normal (non-assignment) case. Should this be left as is?

# def m = (((a rescue b) rescue c) rescue d)
def m = a rescue b rescue c rescue d

When we previously discussed combining and / or with an endless method definition, a comment in a bug report (https://bugs.ruby-lang.org/issues/19392#note-9) mentioned the analogy between an endless method definition and an assignment.
Based on this, I thought that if we follow the behavior of an assignment, it might also be possible to interpret this as (def m = a rescue b) rescue c rescue d.

Certain endless method definitions are arg

Because some endless method definitions are currently defined as arg, the following code is valid.

private :m, def m = 1 rescue 2
private x = def m = 1 rescue 2

However, the endless method definition in the following code is a stmt, which causes a syntax error.

private :m, def m = 1 rescue 2 in 3
private x = def m = 1 rescue 2 in 3

This difference depends on whether pattern matching is written inside the postfix rescue, so in some cases, you can't tell the difference without reading quite far ahead.
Adding the ability to use pattern matching on the right-hand side of an assignment that is defined as a stmt can be seen as widening the gap between assignments that can be used as an arg and those that cannot.
At the same time, when the right-hand side is a command, it behaves similarly, and the command syntax has been in Ruby for a long time. So, one could say it's a characteristic of Ruby's grammar to want to include as many writable things as possible in the arg rule.
It seems that the technique of writing an assignment as an argument has become widespread among people who participate in code golfing, a sport of writing code as short as possible.

x = cmd 1, 2, 3 # ok
m(x = cmd 1, 2, 3) # SyntaxError

Q6. Should this distinction be maintained? Also, in cases where there is only one argument, like private def m = 1 rescue 2 in 3, the body can be passed as an argument even if it's not an arg.

The list of questions

  • Q1. Is it okay to combine them as follows?
  • Q2. The following code does not follow this precedence in either parse.y or prism, so is it correct to say that it's supposed to be interpreted as follows?
  • Q3. What do you think about limiting assignments with pattern matching to the top level?
  • Q4. What can be written in the body of an endless method definition?
  • Q5. In the case of an endless method definition, it uses the same binding method as a normal (non-assignment) case. Should this be left as is?
  • Q6. Should this distinction be maintained? Also, in cases where there is only one argument, like private def m = 1 rescue 2 in 3, the body can be passed as an argument even if it's not an arg.

Updated by naruse (Yui NARUSE) 2 days ago

Prism's behavior should be compatible with Ruby 3.3.
Unless a design change is accepted, it should not break compatibility.
Could you change the behaviors shown in this ticket to match Ruby 3.3's behavior?


Updated by naruse (Yui NARUSE) 2 days ago

  • Backport changed from 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN to 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONTNEED, 3.4: REQUIRED

Updated by kddnewton (Kevin Newton) about 1 hour ago

Should this now be assigned to prism since there is an incompatibility? Sorry I am not clear on the conclusion.

Updated by alanwu (Alan Wu) 40 minutes ago

matz (Yukihiro Matsumoto) wrote in #note-5:

The behavior of Prism in 3.5 is close to my intention.

Matz.

naruse (Yui NARUSE) wrote in #note-11:

Prism's behavior should be compatible with Ruby 3.3.
Unless a design change is accepted, it should not break compatibility.
Could you change the behaviors shown in this ticket to match Ruby 3.3's behavior?

Looks like the behavior for 3.5 is undecided for now. In any case, the behavior for 3.4 should be consistent with 3.3, so it'd be nice to fix Prism's behavior when parsing with 3.4 grammar.
