Bug #5002
closedRipper fails to distinguish local vars from vcalls [PATCH]
Description
Ripper always parses the variable
grammar production (which includes identifiers, {i,c,g}vars, nil, FILE, etc) as a var_ref
node, whose only child is the token itself.
This is a problem for one huge reason: local variables look exactly like vcalls: no-arg, no-receiver method calls. More importantly, the parse tree defines whether a given bareword identifier is a local variable reference or a method call. Thus, given a ripper parse tree, in order to distinguish local variable references from vcalls, one must reconstruct the parse order, re-implement the local variable introduction rules (local variable assigned in some way, for loops, block arg, rescue exception variable, named regex capture groups, ....), and then relabel those var_ref
nodes which are method calls as vcall
nodes.
This is quite a nasty workaround. There are a lot of edge cases to mess up. I've implemented it as the ripper-plus
gem, but it's a huge pain, I'm not sure it's entirely correct, and is something the parser should be doing anyway.
The funny thing is, the parser already is doing almost all of the work! It's just not looking at the local variable tables when it comes time to generate the Ripper event. The patch I've attached does do so - it's a small change for a huge benefit for Ripper users.
I'd like to see this land in 1.9.3 - it's a small patch, and given the other bug fixes Ripper's had this cycle, would make Ripper pretty much sufficient for an entire Ruby implementation.
Files
Updated by nobu (Nobuyoshi Nakada) over 13 years ago
- ruby -v changed from ruby 1.9.3dev (2011-07-09 trunk 32466) [x86_64-darwin10.8.0] to -
Hi,
At Sat, 9 Jul 2011 12:24:04 +0900,
Michael Edgar wrote in [ruby-core:37908]:
The funny thing is, the parser already is doing almost all
of the work! It's just not looking at the local variable
tables when it comes time to generate the Ripper event. The
patch I've attached does do so - it's a small change for a
huge benefit for Ripper users.
I don't think 'self', 'nil', 'true', and so on are vcall. Of
course they are neither really variables, but a kind of it
syntactically. Reviewing from this point, your patch seems
wrong about use of get_id(), and I suspect it might need to
split the "variable" rule.
--
Nobu Nakada
Updated by adgar (Michael Edgar) over 13 years ago
- File vcall.breaking.diff vcall.breaking.diff added
- File vcall.same_errors.diff vcall.same_errors.diff added
Ack - I missed how nil/self would be caught as vcalls there.
As you note, splitting the variable node is necessary. I split 'variable' into 'user_variable' and 'keyword_variable', removing 'variable' entirely (since it would give r/r conflicts). However, this turns out to be a good thing overall: every other use of the variable
production is on the LHS (or LHS-like constructs, like rescue Foo => exc), which bar the use of keywords anyway through manual checking.
So, I have two patches: one which just replaces all other uses of variable
with user_variable
. This is "vcall.breaking.diff", as it breaks some previous error-case behavior. The allowed syntax is identical; the error reported on invalid syntax (self = 'foo'), is not. It will, in this patch, simply emit a generic bison "unexpected ..." error, as the grammar will not match such constructs.
The second patch adds back in keyword_variable
productions, so that error reporting will be the same, even though the new productions will always generate errors. This is "vcall.same_errors.diff". For a patch release, I imagine maintaining the errors may be preferable, so I've included it for completeness.
Updated by nobu (Nobuyoshi Nakada) over 13 years ago
Hi,
At Sun, 10 Jul 2011 04:55:07 +0900,
Michael Edgar wrote in [ruby-core:37931]:
As you note, splitting the variable node is necessary. I
split 'variable' into 'user_variable' and 'keyword_variable',
removing 'variable' entirely (since it would give r/r
conflicts). However, this turns out to be a good thing
overall: every other use of thevariable
production is on
the LHS (or LHS-like constructs, like rescue Foo
Updated by kosaki (Motohiro KOSAKI) over 13 years ago
- Category set to core
- Status changed from Open to Assigned
- Assignee set to nobu (Nobuyoshi Nakada)
Updated by nobu (Nobuyoshi Nakada) over 13 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r32498.
Michael, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- parse.y (var_ref): distinguish vcall from local variable
references. based on a patch by Michael Edgar michael.j.edgar
AT dartmouth.edu. Bug #5002