Bug #5002

closed

Ripper fails to distinguish local vars from vcalls [PATCH]

Added by adgar (Michael Edgar) over 13 years ago. Updated over 13 years ago.

Status: Closed
Target version:
ruby -v: -
Backport:
[ruby-core:37908]

Description

Ripper always parses the variable grammar production (which includes identifiers, {i,c,g}vars, nil, __FILE__, etc.) as a var_ref node, whose only child is the token itself.

This is a problem for one huge reason: local variables look exactly like vcalls (no-arg, no-receiver method calls). More importantly, the parse tree defines whether a given bareword identifier is a local variable reference or a method call. Thus, given a Ripper parse tree, in order to distinguish local variable references from vcalls, one must reconstruct the parse order, re-implement the local variable introduction rules (local variable assigned in some way, for loops, block args, rescue exception variables, named regex capture groups, ...), and then relabel those var_ref nodes which are method calls as vcall nodes.
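To make the ambiguity concrete, here is a minimal sketch of how the same bareword changes meaning depending on what the parser has already seen. It assumes a Ruby whose Ripper includes the fix discussed in this ticket (so that it emits vcall nodes); `reference_kinds` is a hypothetical helper written for this illustration, not part of Ripper.

```ruby
require 'ripper'

# Hypothetical helper (not part of Ripper): collect the node type
# (:var_ref or :vcall) of every bareword reference to `name`, in the
# order the nodes appear in the tree returned by Ripper.sexp.
def reference_kinds(source, name)
  kinds = []
  walk = lambda do |node|
    next unless node.is_a?(Array)
    if (node[0] == :var_ref || node[0] == :vcall) &&
       node[1].is_a?(Array) && node[1][0] == :@ident && node[1][1] == name
      kinds << node[0]
    end
    node.each { |child| walk.call(child) }
  end
  walk.call(Ripper.sexp(source))
  kinds
end

# The first `foo` is seen before any assignment, so it is a method call;
# the last `foo` follows `foo = 1`, so it is a local variable reference.
p reference_kinds("foo; foo = 1; foo", "foo")   # [:vcall, :var_ref]

# A block argument introduces a local; an unknown bareword stays a vcall.
p reference_kinds("[1].each { |x| x; y }", "x") # [:var_ref]
p reference_kinds("[1].each { |x| x; y }", "y") # [:vcall]
```

Without this distinction in the tree, a Ripper user would have to track every construct that introduces a local in order to recover the same labels.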

This is quite a nasty workaround. There are a lot of edge cases to mess up. I've implemented it as the ripper-plus gem, but it's a huge pain, I'm not sure it's entirely correct, and it's something the parser should be doing anyway.

The funny thing is, the parser already is doing almost all of the work! It's just not looking at the local variable tables when it comes time to generate the Ripper event. The patch I've attached does do so - it's a small change for a huge benefit for Ripper users.

I'd like to see this land in 1.9.3 - it's a small patch, and given the other bug fixes Ripper's had this cycle, would make Ripper pretty much sufficient for an entire Ruby implementation.


Files

ripper.vcall.diff (1.64 KB) ripper.vcall.diff adgar (Michael Edgar), 07/09/2011 12:24 PM
vcall.breaking.diff (3.51 KB) vcall.breaking.diff Implementation which changes error messages for `nil = foo` adgar (Michael Edgar), 07/10/2011 04:55 AM
vcall.same_errors.diff (4.24 KB) vcall.same_errors.diff Implementation with additional grammar rules to match existing error messages adgar (Michael Edgar), 07/10/2011 04:55 AM

Updated by nobu (Nobuyoshi Nakada) over 13 years ago

  • ruby -v changed from ruby 1.9.3dev (2011-07-09 trunk 32466) [x86_64-darwin10.8.0] to -

Hi,

At Sat, 9 Jul 2011 12:24:04 +0900,
Michael Edgar wrote in [ruby-core:37908]:

> The funny thing is, the parser already is doing almost all
> of the work! It's just not looking at the local variable
> tables when it comes time to generate the Ripper event. The
> patch I've attached does do so - it's a small change for a
> huge benefit for Ripper users.

I don't think 'self', 'nil', 'true', and so on are vcalls. Of
course they are not really variables either, but they are a kind
of variable syntactically. Reviewed from this point of view, your
patch seems wrong in its use of get_id(), and I suspect it might
need to split the "variable" rule.

--
Nobu Nakada
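As a rough illustration of the point about 'self' and 'nil': on a Ruby whose Ripper includes the eventual fix, keywords still come back as var_ref nodes wrapping a keyword token, while an unknown bareword comes back as a vcall. `first_statement` is a hypothetical helper used only for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: the first statement of the parsed program,
# taken from the S-expression returned by Ripper.sexp.
def first_statement(source)
  Ripper.sexp(source)[1][0]
end

p first_statement("nil")   # a :var_ref node wrapping an :@kw token
p first_statement("self")  # a :var_ref node wrapping an :@kw token
p first_statement("foo")   # a :vcall node wrapping an :@ident token
```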

Updated by adgar (Michael Edgar) over 13 years ago

Ack - I missed how nil/self would be caught as vcalls there.

As you note, splitting the variable node is necessary. I split 'variable' into 'user_variable' and 'keyword_variable', removing 'variable' entirely (since it would give r/r conflicts). However, this turns out to be a good thing overall: every other use of the variable production is on the LHS (or LHS-like constructs, like rescue Foo => exc), which bar the use of keywords anyway through manual checking.

So, I have two patches. The first just replaces all other uses of variable with user_variable. This is "vcall.breaking.diff", as it breaks some previous error-case behavior. The allowed syntax is identical; the error reported for invalid syntax (self = 'foo') is not. With this patch, the parser will simply emit a generic bison "unexpected ..." error, as the grammar will not match such constructs.
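For reference, the constructs in question are rejected at parse time either way; the two patches differ only in the wording of the error. The sketch below checks just that a SyntaxError is raised and deliberately makes no assumption about the message text, which varies between Ruby versions. `syntax_error?` is a hypothetical helper for this illustration.

```ruby
# Hypothetical helper: true when Ruby rejects `source` at parse time.
# eval is used here only as a convenient way to invoke the parser.
def syntax_error?(source)
  eval(source)
  false
rescue SyntaxError
  true
end

puts syntax_error?("self = 'foo'")  # keyword on the LHS: rejected
puts syntax_error?("nil = foo")     # keyword on the LHS: rejected
puts syntax_error?("foo = 'bar'")   # ordinary local assignment: accepted
```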

The second patch adds back in keyword_variable productions, so that error reporting will be the same, even though the new productions will always generate errors. This is "vcall.same_errors.diff". For a patch release, I imagine maintaining the errors may be preferable, so I've included it for completeness.

Updated by nobu (Nobuyoshi Nakada) over 13 years ago

Hi,

At Sun, 10 Jul 2011 04:55:07 +0900,
Michael Edgar wrote in [ruby-core:37931]:

> As you note, splitting the variable node is necessary. I
> split 'variable' into 'user_variable' and 'keyword_variable',
> removing 'variable' entirely (since it would give r/r
> conflicts). However, this turns out to be a good thing
> overall: every other use of the variable production is on
> the LHS (or LHS-like constructs, like rescue Foo

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

  • Category set to core
  • Status changed from Open to Assigned
  • Assignee set to nobu (Nobuyoshi Nakada)

Updated by nobu (Nobuyoshi Nakada) over 13 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r32498.
Michael, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • parse.y (var_ref): distinguish vcall from local variable
    references. based on a patch by Michael Edgar michael.j.edgar
    AT dartmouth.edu. Bug #5002