Feature #19013
closed
Error Tolerant Parser
Description
Background
An implementation of the Language Server Protocol (LSP) sometimes needs to parse an incomplete Ruby script. For example, a user may want to run completion in the middle of a statement, as below:
class A
  def m
    a = 10
    if # here users want to run completion
  end
end
In such cases, an LSP implementation wants to get a partial AST instead of a syntax error.
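The failure described above is easy to reproduce. The sketch below assumes CRuby 2.6 or later, where RubyVM::AbstractSyntaxTree is available, and parses the example script with default options. The helper name default_parse_result is only illustrative.

```ruby
# The incomplete script from the example above.
INCOMPLETE_SRC = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

# Hypothetical helper: returns :ok when the source parses with the
# default (non-tolerant) parser, :syntax_error otherwise.
def default_parse_result(src)
  RubyVM::AbstractSyntaxTree.parse(src)
  :ok
rescue SyntaxError
  :syntax_error
end

puts default_parse_result(INCOMPLETE_SRC)
```

With the default options no AST is returned at all, which is the situation this proposal tries to improve.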
Proposal
At the moment I want to propose 3 types of tolerance:
1. Supply a missing end when the lexer reaches end-of-input without enough end keywords
In the following case, the lexer generates one end token before it generates end-of-input.
describe "1" do
  describe "2" do
    describe "3" do
      it "here" do
      end
    end
  end
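The effect of this rule can be approximated from user code. The sketch below is not how the lexer implements it: it is a naive retry loop that appends end lines until the default parser accepts the source. It assumes CRuby with RubyVM::AbstractSyntaxTree, and complement_ends is a hypothetical helper, not part of Ruby.

```ruby
# The example above: four "do" blocks but only three "end" keywords.
UNCLOSED_SRC = <<~RUBY
  describe "1" do
    describe "2" do
      describe "3" do
        it "here" do
        end
      end
    end
RUBY

# Hypothetical helper (an assumption for illustration): append "end"
# lines until the source parses, giving up after max_ends attempts.
# Returns [patched_source, number_of_ends_added], or nil on failure.
def complement_ends(src, max_ends: 10)
  (0..max_ends).each do |added|
    candidate = src + ("end\n" * added)
    begin
      RubyVM::AbstractSyntaxTree.parse(candidate)
      return [candidate, added]
    rescue SyntaxError
      next
    end
  end
  nil
end

_patched, added = complement_ends(UNCLOSED_SRC)
puts "ends added: #{added}"
```

Doing this inside the lexer, as proposed, avoids re-parsing the whole script for every retry.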
2. Treat "end" as a keyword, not an identifier, based on indentation
In the following case, a normal parser recognizes the "end" on line 4 as a "local variable or method".
This not only causes a syntax error but also makes the bar method definition be treated as Z::Foo#bar.
Another approach is to suppress the !IS_lex_state(EXPR_DOT) check for "end".
module Z
  class Foo
    foo.
  end
  def bar
  end
end
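The lexer behavior behind this case can be observed with Ripper from the standard library. The sketch below compares how "end" is tokenized after a trailing dot and after an ordinary statement. The helper end_token_event is hypothetical and the two snippets are reduced versions of the example, not taken from the patch.

```ruby
require "ripper"

# Hypothetical helper: return the Ripper event name of the first token
# whose text is "end". Each element of Ripper.lex is
# [[line, column], event, token, state].
def end_token_event(src)
  _pos, event, _tok, _state = Ripper.lex(src).find { |(_, _, tok, _)| tok == "end" }
  event
end

after_dot = end_token_event("foo.\nend\n")   # "end" follows a trailing dot
after_if  = end_token_event("if foo\nend\n") # "end" closes the if

puts "after dot: #{after_dot}"
puts "after if:  #{after_if}"
```

After the dot the token is reported as an identifier, so "foo.\nend" is read as a call to a method named end. That is why the class body in the example is not closed on line 4.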
3. Change the locations of error
Currently the error token is placed in top_stmts and stmts, as top_stmts: error top_stmt and stmts: error stmt.
However, these locations are too strict to catch syntax errors, so I want to move it to stmt: error and expr_value: error.
Interface
- Adding error_tolerant option to RubyVM::AbstractSyntaxTree.parse
- Adding --error-tolerant-parser option to ruby command for debugging
  - This option is valid only when --dump=yydebug, --dump=parsetree or --dump=parsetree_with_comment is passed
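A minimal usage sketch of the proposed option follows. It assumes a CRuby build where RubyVM::AbstractSyntaxTree.parse accepts the error_tolerant keyword, and it checks for that keyword at runtime so that it degrades to nil elsewhere. parse_tolerantly is a hypothetical helper name.

```ruby
# The incomplete script from the Background section.
BROKEN_SRC = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

# Hypothetical helper: returns a Node when the running Ruby supports the
# error_tolerant keyword, or nil when it does not.
def parse_tolerantly(src)
  return nil unless defined?(RubyVM::AbstractSyntaxTree)

  params = RubyVM::AbstractSyntaxTree.method(:parse).parameters
  return nil unless params.include?([:key, :error_tolerant])

  RubyVM::AbstractSyntaxTree.parse(src, error_tolerant: true)
end

node = parse_tolerantly(BROKEN_SRC)
if node
  puts "partial AST root: #{node.type}"
else
  puts "error_tolerant is not available on this Ruby"
end
```

The exact shape of the returned tree for broken input is not specified here, so tools should treat it as best effort.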
Compatibility
Changing the location of error can lead to incompatibility. I observed at least 2 test cases in ruby/ruby that are broken by this change.
I think both of them depend on how ripper behaves after it raises a syntax error.
- RDoc: https://github.com/yui-knk/ruby/commit/1dabbe508f0cc3dd4f83aa72502bbf347029dd8c
  - However, the Ruby script in the heredoc is invalid...
- irb: https://github.com/yui-knk/ruby/commit/e18be19ecd044eb26a56f6f9ba4f19d40c01a9c7
  - The range of error coloring is changed
All other changes are related to the lexer, not the parser, and they are controlled by the error_tolerant option. Therefore no behavior change is expected for the ruby parser and ripper.
Implementation
https://github.com/yui-knk/ruby/tree/error_recovery_indent_aware
Updated by duerst (Martin Dürst) about 2 years ago
The topic of parsing incomplete syntax also came up in Kevin Newton's talk (see https://rubykaigi.org/2022/presentations/kddnewton.html) at RubyKaigi 2022. In the talk, he said he is working on a new parser. Maybe these efforts could be combined?
Updated by matz (Yukihiro Matsumoto) about 2 years ago
Kevin's work has broader goals, e.g. being faster and consuming less memory, and it should be free from yacc/bison limitations.
I consider this work as an experiment to explore error-tolerant-ness.
Matz.
Updated by yui-knk (Kaneko Yuichiro) about 2 years ago
- Status changed from Open to Closed
Applied in changeset git|fbbdbdd8911ffb24d98bb71c7c33d24609ce7dfe.
Add error_tolerant option to RubyVM::AST
If this option is enabled, SyntaxError is not raised and Node is
returned even if passed script is broken.
[Feature #19013]