Feature #19013
Closed
Error Tolerant Parser
Description
Background
Implementations of the Language Server Protocol (LSP) sometimes need to parse incomplete Ruby scripts. For example, users may want completion in the middle of a statement, as below:
class A
  def m
    a = 10
    if # here users want to run completion
  end
end
In such cases, an LSP implementation wants to get a partial AST instead of a syntax error.
Proposal
At the moment I want to propose 3 types of tolerance:
1. Complement "end" when the lexer hits end-of-input but there are not enough "end" keywords
For example, in the case below the lexer will generate one "end" token before it generates end-of-input.
describe "1" do
  describe "2" do
    describe "3" do
      it "here" do
      end
    end
  end
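The effect of this tolerance can be approximated outside the parser. The sketch below is not the proposed change itself, which is made in the lexer; it is a line-based toy, and the method name complement_ends and its opener heuristic are assumptions made purely for illustration. It counts block openers against "end" lines and appends whatever is missing at end-of-input.

```ruby
# Toy approximation of tolerance 1: append the "end" keywords that a
# truncated script is missing. Hypothetical helper, not parser code.
# Strings, comments, heredocs and modifier forms (x if y) are ignored.
def complement_ends(source)
  # A line opens a block if it starts with one of these keywords
  # or ends with "do" (optionally followed by block parameters).
  opener = /\A(?:class|module|def|if|unless|while|until|case|begin)\b|\bdo\s*(?:\|[^|]*\|)?\z/
  depth = 0
  source.each_line do |line|
    text = line.strip
    if text == "end"
      depth -= 1 if depth > 0
    elsif text.match?(opener)
      depth += 1
    end
  end
  # Make sure the appended keywords start on their own line.
  source += "\n" unless source.empty? || source.end_with?("\n")
  source + ("end\n" * depth)
end

src = <<~RUBY
  describe "1" do
    describe "2" do
      describe "3" do
        it "here" do
        end
      end
    end
RUBY
puts complement_ends(src)
```

For the example above, four openers are seen but only three "end" lines, so one "end" is appended. The real lexer has full token information and does not need the line-based guesswork used here.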
2. Treat "end" as a keyword, not an identifier, based on indentation
For example, in the case below the normal parser recognizes "end" on line 4 as a "local variable or method".
This causes not only a syntax error, but also the bar method definition to be treated as Z::Foo#bar.
Another approach is to suppress the !IS_lex_state(EXPR_DOT) check for "end".
module Z
  class Foo
    foo.
  end
  def bar
  end
end
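The indentation idea can be illustrated with a small sketch. This is not the code in the linked branch; the helper name match_ends_by_indent and the line-based opener detection are assumptions for illustration only. A line consisting solely of "end" is paired with the innermost open construct that has the same indentation, regardless of what preceded it on the previous line.

```ruby
# Toy sketch of tolerance 2: pair each bare "end" line with an opener by
# comparing indentation. Returns [[end_line_number, opener_text], ...].
# Hypothetical helper, not parser code.
def match_ends_by_indent(source)
  opener = /\A(?:class|module|def|if|unless|while|until|case|begin)\b|\bdo\s*(?:\|[^|]*\|)?\z/
  stack = [] # [indent, text] for each construct that is still open
  pairs = []
  source.each_line.with_index(1) do |line, lineno|
    text = line.strip
    next if text.empty?
    indent = line[/\A */].size
    if text == "end"
      # Openers indented deeper than this "end" are treated as unclosed.
      stack.pop while stack.any? && stack.last[0] > indent
      pairs << [lineno, stack.pop[1]] if stack.any? && stack.last[0] == indent
    elsif text.match?(opener)
      stack << [indent, text]
    end
  end
  pairs
end

src = <<~RUBY
  module Z
    class Foo
      foo.
    end
    def bar
    end
  end
RUBY
p match_ends_by_indent(src)
# => [[4, "class Foo"], [6, "def bar"], [7, "module Z"]]
```

With this pairing, the "end" on line 4 closes class Foo even though it follows "foo.", so bar ends up directly under module Z.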
3. Change locations of error
Currently the error token is placed in top_stmts and stmts, as top_stmts: error top_stmt and stmts: error stmt.
However, these rules are too strict to catch syntax errors, so I want to move it to stmt: error and expr_value: error.
Interface
- Adding error_tolerant option to RubyVM::AbstractSyntaxTree.parse
- Adding --error-tolerant-parser option to ruby command for debugging
  - This option is valid only when --dump=yydebug, --dump=parsetree or --dump=parsetree_with_comment is passed
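Calling the first interface might look like the sketch below. The error_tolerant option name comes from this proposal; the wrapper parse_tolerantly and its nil fallback for Rubies that lack the option are assumptions added for illustration, and the shape of the returned tree for broken input is not shown because it depends on the Ruby version.

```ruby
# Sketch: parse possibly broken source with the error_tolerant option.
# Returns a RubyVM::AbstractSyntaxTree::Node, or nil when the running Ruby
# does not support the option (hypothetical fallback, not part of the proposal).
def parse_tolerantly(source)
  return nil unless defined?(RubyVM::AbstractSyntaxTree)
  RubyVM::AbstractSyntaxTree.parse(source, error_tolerant: true)
rescue ArgumentError, SyntaxError
  # ArgumentError: the option is unknown to this Ruby.
  # SyntaxError: the parser could not recover after all.
  nil
end

broken = "class A\n  def m\n    a = 10\n    if\n  end\nend\n"
node = parse_tolerantly(broken)
puts(node ? "root node type: #{node.type}" : "error_tolerant is not available here")
```

An LSP implementation would then walk node.children to find the construct around the cursor instead of giving up on a SyntaxError.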
Compatibility
Changing the location of error can lead to incompatibility. I observed that at least 2 test cases in ruby/ruby are broken by this change.
I think both of them depend on how ripper behaves after it raises a syntax error.
- RDoc: https://github.com/yui-knk/ruby/commit/1dabbe508f0cc3dd4f83aa72502bbf347029dd8c
  - However ruby script in heredoc is invalid...
- irb: https://github.com/yui-knk/ruby/commit/e18be19ecd044eb26a56f6f9ba4f19d40c01a9c7
  - Range of error coloring is changed
All other changes relate to the lexer rather than the parser, and they are controlled by the error_tolerant option. Therefore no behavior change is expected for the ruby parser or ripper.
Implementation
https://github.com/yui-knk/ruby/tree/error_recovery_indent_aware
Updated by duerst (Martin Dürst) about 3 years ago
The topic of parsing incomplete syntax also came up in Kevin Newton's talk (see https://rubykaigi.org/2022/presentations/kddnewton.html) at RubyKaigi 2022. In the talk, he said he is working on a new parser. Maybe these efforts could be combined?
Updated by matz (Yukihiro Matsumoto) about 3 years ago
Kevin's work has broader goals, e.g. being faster, consuming less memory, which should be free from yacc/bison limitation.
I consider this work as an experiment to explore error-tolerant-ness.
Matz.
Updated by yui-knk (Kaneko Yuichiro) about 3 years ago
- Status changed from Open to Closed
Applied in changeset git|fbbdbdd8911ffb24d98bb71c7c33d24609ce7dfe.
Add error_tolerant option to RubyVM::AST
If this option is enabled, SyntaxError is not raised and Node is
returned even if passed script is broken.
[Feature #19013]