Feature #19013

Error Tolerant Parser

Added by yui-knk (Kaneko Yuichiro) almost 3 years ago. Updated almost 3 years ago.

Status:

Closed

Assignee:

Target version:

[ruby-core:109977]

Description

Background

Implementations of the Language Server Protocol (LSP) sometimes need to parse an incomplete Ruby script. For example, a user may want to run completion in the middle of a statement, as below:

class A
  def m
    a = 10
    if # here users want to run completion
  end
end

In such a case, an LSP implementation wants to get a partial AST instead of a syntax error.
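To see the problem concretely, the snippet below feeds the incomplete script above to RubyVM::AbstractSyntaxTree.parse without any tolerance option. It is a minimal sketch that assumes CRuby (where RubyVM is available); the constant INCOMPLETE_SRC and the helper try_parse are hypothetical names used only for illustration.

```ruby
# Incomplete script from the example above: the `if` has no condition yet.
INCOMPLETE_SRC = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

# Hypothetical helper: returns the AST root node, or the SyntaxError
# that the parser raised for a broken script.
def try_parse(src)
  RubyVM::AbstractSyntaxTree.parse(src)
rescue SyntaxError => e
  e
end

result = try_parse(INCOMPLETE_SRC)
puts result.class
```

Without tolerance the caller only gets an exception, so there is no tree on which completion could be based.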

Proposal

At the moment I want to propose three types of tolerance.

1. Complement `end` when the lexer hits end-of-input but there are not enough `end`s

In the following case, the lexer generates one `end` token before it generates end-of-input.

describe "1" do
  describe "2" do
    describe "3" do
      it "here" do
    end
  end
end
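As a rough illustration of the idea (not the actual parse.y change), the sketch below uses Ripper.lex from the standard library to count `do` and `end` keyword tokens in the example and appends the missing `end`s. The names DESCRIBE_SRC, missing_ends and complement_ends are hypothetical, and counting only `do` is a simplification that happens to work for this example; a real implementation would have to track every construct that needs a closing `end`.

```ruby
require "ripper"

DESCRIBE_SRC = <<~RUBY
  describe "1" do
    describe "2" do
      describe "3" do
        it "here" do
      end
    end
  end
RUBY

# Count how many `do` keywords have no matching `end` keyword.
def missing_ends(src)
  keywords = Ripper.lex(src)
                   .select { |_pos, event, _tok| event == :on_kw }
                   .map { |_pos, _event, tok| tok }
  [keywords.count("do") - keywords.count("end"), 0].max
end

# Append as many `end` lines as are missing, mimicking the lexer
# emitting extra `end` tokens just before end-of-input.
def complement_ends(src)
  src + "end\n" * missing_ends(src)
end

repaired = complement_ends(DESCRIBE_SRC)
puts missing_ends(DESCRIBE_SRC)
puts Ripper.sexp(repaired).nil? ? "still broken" : "parses"
```

Ripper.sexp returns nil for a script with a syntax error, which is why it serves here as a simple check that the repaired text parses.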

2. Extract "end" as a keyword, not an identifier, based on indentation

In the following case, the normal parser recognizes the "end" on line 4 as a "local variable or method".
This causes not only a syntax error but also the definition of the bar method to be treated as Z::Foo#bar.
Another approach is to suppress the !IS_lex_state(EXPR_DOT) check for "end".

module Z
  class Foo
    foo.
  end

  def bar
  end
end
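The lexer behavior described above can be observed with Ripper.lex from the standard library. This is only a sketch for inspection: it shows how the current lexer classifies each "end" in the example, not the indent-based recovery proposed here. The names MODULE_SRC and end_tokens are hypothetical.

```ruby
require "ripper"

MODULE_SRC = <<~RUBY
  module Z
    class Foo
      foo.
    end

    def bar
    end
  end
RUBY

# Collect [line, event] for every token whose text is "end".
def end_tokens(src)
  Ripper.lex(src)
        .select { |_pos, _event, tok| tok == "end" }
        .map { |(line, _col), event, _tok| [line, event] }
end

end_tokens(MODULE_SRC).each do |line, event|
  puts "line #{line}: #{event}"
end
```

On CRuby, the "end" on line 4 should be reported as on_ident because it directly follows `foo.`, while the later ones are reported as on_kw.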

3. Change the locations of `error`

Currently `error` is placed in top_stmts and stmts, as in top_stmts: error top_stmt and stmts: error stmt.
However, these rules are too strict to catch syntax errors, so I want to move `error` to stmt: error and expr_value: error.

Interface

Add an error_tolerant option to RubyVM::AbstractSyntaxTree.parse.
Add an --error-tolerant-parser option to the ruby command for debugging.
- This option is valid only when --dump=yydebug, --dump=parsetree or --dump=parsetree_with_comment is passed.
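A minimal sketch of how the proposed option might be called from Ruby follows. It assumes CRuby with the error_tolerant keyword available; on a Ruby that does not know the keyword, the hypothetical helper parse_tolerantly returns nil instead. BROKEN_SRC is likewise a name chosen for illustration.

```ruby
BROKEN_SRC = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

# Hypothetical helper: parse with error tolerance when the running Ruby
# supports the option; return nil otherwise.
def parse_tolerantly(src)
  RubyVM::AbstractSyntaxTree.parse(src, error_tolerant: true)
rescue ArgumentError
  nil # this Ruby does not know the error_tolerant keyword
end

root = parse_tolerantly(BROKEN_SRC)
if root
  puts root.type            # type of the root node of the partial AST
  puts root.children.length # number of direct children of the root
else
  puts "error_tolerant is not available on this Ruby"
end
```

The point of the option is that the caller receives a node to walk even though the script is broken, rather than a SyntaxError.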

Compatibility

Changing the location of `error` can lead to incompatibility. I observed that at least two test cases in ruby/ruby are broken by this change.
I think both of them depend on how Ripper behaves after it raises a syntax error.

RDoc: https://github.com/yui-knk/ruby/commit/1dabbe508f0cc3dd4f83aa72502bbf347029dd8c
- However ruby script in heredoc is invalid...
irb: https://github.com/yui-knk/ruby/commit/e18be19ecd044eb26a56f6f9ba4f19d40c01a9c7
- Range of error coloring is changed

All other changes relate to the lexer rather than the parser, and they are controlled by the error_tolerant option. Therefore no behavior change is expected for the Ruby parser or Ripper.

Implementation

https://github.com/yui-knk/ruby/tree/error_recovery_indent_aware

#1 [ruby-core:109984]

Updated by duerst (Martin Dürst) almost 3 years ago

The topic of parsing incomplete syntax also came up in Kevin Newton's talk (see https://rubykaigi.org/2022/presentations/kddnewton.html) at RubyKaigi 2022. In the talk, he said he is working on a new parser. Maybe these efforts could be combined?

#2 [ruby-core:110003]

Updated by matz (Yukihiro Matsumoto) almost 3 years ago

Kevin's work has broader goals, e.g. being faster and consuming less memory, and it should be free from the limitations of yacc/bison.
I consider this work an experiment to explore error tolerance.

Matz.

Updated by yui-knk (Kaneko Yuichiro) almost 3 years ago

Status changed from Open to Closed

Applied in changeset git|fbbdbdd8911ffb24d98bb71c7c33d24609ce7dfe.

Add error_tolerant option to RubyVM::AST

If this option is enabled, SyntaxError is not raised and Node is
returned even if passed script is broken.

[Feature #19013]
