Feature #11868
open
Proposal for RubyVM::InstructionSequence.compile to return an object containing the syntax error information currently written to STDERR
Description
Currently, RubyVM::InstructionSequence.compile or RubyVM::InstructionSequence.new return a new InstructionSequence for valid ruby.
For invalid syntax, a SyntaxError is raised with a message of 'compile error'. Meanwhile, the useful information, line number(s) and hint(s) to the invalid syntax location, is printed on standard error. I am proposing this information be returned as an object in the event of a SyntaxError.
For example, here's good syntax:
RubyVM::InstructionSequence.new("x =1")
# => <RubyVM::InstructionSequence:<compiled>@<compiled>>
Here's bad syntax:
RubyVM::InstructionSequence.new("puts 'hi'\n puts 'hi2'\n\nthis.is -> not -> valid $ruby:syntax")
# => SyntaxError: compile error
# The useful hint and line number(s) are on standard error:
<compiled>:4: syntax error, unexpected keyword_not, expecting keyword_do_LAMBDA or tLAMBEG
this.is -> not -> valid $ruby:syntax
^
<compiled>:4: syntax error, unexpected tGVAR, expecting keyword_do_LAMBDA or tLAMBEG
this.is -> not -> valid $ruby:syntax
^
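A minimal sketch of the rescue pattern discussed here. The helper name syntax_error_for is hypothetical. On the Ruby version described in this report the rescued message is only 'compile error'; whether the message carries more detail, and in what format, depends on the Ruby version, so the sketch does not rely on its contents.

```ruby
# Hypothetical helper: returns nil when the source compiles,
# otherwise the rescued SyntaxError.
def syntax_error_for(source)
  RubyVM::InstructionSequence.compile(source)
  nil
rescue SyntaxError => e
  e
end

error = syntax_error_for("puts 'hi'\n puts 'hi2'\n\nthis.is -> not -> valid $ruby:syntax")
puts error.class
puts error.message # contents vary by Ruby version
```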
Some ideas:
- Add methods to all SyntaxError exceptions to get all parse failures. For example: syntax_error.parse_failures.each {|f| puts f.lineno; puts f.hint }. In the above example, it failed on line 4 twice and we see two "hints."
- Add a new method to RubyVM::InstructionSequence that checks ruby syntax, telling us whether the syntax is valid and, if not, the lineno and 'hint' for each parse failure.
Use case: Rubocop[a] and other utilities[b] are really complicated and check for valid ruby syntax by creating a process to run ruby -wc with the script.
[a] https://github.com/bbatsov/rubocop/blob/86e1acf67794bf6dd5d65812b91df475e44fa320/spec/support/mri_syntax_checker.rb#L51-L63
[b] https://github.com/ManageIQ/manageiq/blob/6725fe52222c07d576a18126d2ff825ddc6dffd0/gems/pending/util/miq-syntax-checker.rb#L8-L13
It would be nice to remove all of this complexity and use RubyVM::InstructionSequence, which already has the information we need, if it exposed that information in a more user-friendly format.
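For context, the shell-out approach can be sketched roughly as follows. This is not the code from the linked projects; the helper name check_syntax_via_subprocess is hypothetical, and the sketch assumes the source is fed to ruby -wc on standard input.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: runs `ruby -wc` in a child process, feeding the
# source on standard input, and returns [valid, stderr_text].
def check_syntax_via_subprocess(source)
  _stdout, stderr, status = Open3.capture3(RbConfig.ruby, "-wc", stdin_data: source)
  [status.success?, stderr]
end

valid, diagnostics = check_syntax_via_subprocess("class Joe\n  def test-\n  end\nend\n")
puts valid
puts diagnostics # free-form text that the caller still has to parse
```

The cost is a new process per check plus text parsing of whatever the child wrote to standard error, which is the complexity this proposal hopes to remove.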
Thanks!
Joe Rafaniello
Updated by jrafanie (Joe Rafaniello) almost 9 years ago
Note: I also tried ripper, but all of its methods return nil for invalid syntax. RubyVM::InstructionSequence seemed like the easiest place to add this feature since it already has the information I need.
Updated by nobu (Nobuyoshi Nakada) almost 9 years ago
assert_valid_syntax may help you.
Updated by jrafanie (Joe Rafaniello) almost 9 years ago
Nobuyoshi Nakada wrote:
assert_valid_syntax may help you.
Thank you for reviewing this so quickly and the suggestion. I forgot that test/unit had such an assertion.
However, I think this assertion doesn't provide the information I need.
RubyVM::InstructionSequence.compile provides this information to standard error: the line number(s), specific error(s), and the location "hint(s)". The line number is very important for style checkers such as rubocop as it helps identify the location of the invalid syntax. The hint is also very helpful.
For example:
$ cat test.rb
class Joe
def test-
end
end
Running rubocop:
$ rubocop test.rb
Inspecting 1 file
E
Offenses:
test.rb:2:11: E: unexpected token tMINUS
def test-
^
1 file inspected, 1 offense detected
With a 4 line file, this is not that difficult but with much larger files and many changes happening, it's easy to make a mistake or typo and not "see" the problem immediately.
It would be great to have access to this useful information directly in ruby through lineno and hint methods (or better names) and not have to capture and parse STDERR manually.
Thank you for the consideration.
Updated by nobu (Nobuyoshi Nakada) almost 9 years ago
Updated by jrafanie (Joe Rafaniello) almost 9 years ago
Nobuyoshi Nakada wrote:
What's hint?
https://github.com/ruby/ruby/compare/trunk...nobu:feature/11868-SyntaxError-location
Thank you for starting on this already. I appreciate it.
The hint/location suggestion is what ruby prints indicating where the syntax error occurred.
Sorry, hint is probably the wrong word.
In the example below:
<compiled>:4: syntax error, unexpected keyword_not, expecting keyword_do_LAMBDA or tLAMBEG
this.is -> not -> valid $ruby:syntax
^
<compiled>:4: syntax error, unexpected tGVAR, expecting keyword_do_LAMBDA or tLAMBEG
this.is -> not -> valid $ruby:syntax
^
There are two syntax errors.
Error 1
line number: 4
Error:
unexpected keyword_not, expecting keyword_do_LAMBDA or tLAMBEG
The "hint" or location is:
this.is -> not -> valid $ruby:syntax
^
Error 2
line number: 4
Error:
unexpected tGVAR, expecting keyword_do_LAMBDA or tLAMBEG
The "hint" or location is:
this.is -> not -> valid $ruby:syntax
^
Thank you!
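The breakdown above can be sketched as a small parser over the text currently written to standard error. This is only an illustration of the manual parsing the proposal wants to avoid: ParseFailure, HEADER and parse_failures are hypothetical names, and the regular expression assumes the "file:lineno: syntax error, message" layout shown in this example.

```ruby
# Hypothetical value object; the field names are illustrative only.
ParseFailure = Struct.new(:file, :lineno, :message, :hint)

# Assumed layout of a header line: "<file>:<lineno>: syntax error, <message>"
HEADER = /\A(?<file>.+?):(?<lineno>\d+): syntax error, (?<message>.*)\z/

# Splits the diagnostic text into one ParseFailure per header line.
# Lines following a header are collected as that failure's hint.
def parse_failures(stderr_text)
  failures = []
  stderr_text.each_line(chomp: true) do |line|
    if (m = HEADER.match(line))
      failures << ParseFailure.new(m[:file], m[:lineno].to_i, m[:message], [])
    elsif failures.any?
      failures.last.hint << line
    end
  end
  failures
end

text = <<~'TEXT'
  <compiled>:4: syntax error, unexpected keyword_not, expecting keyword_do_LAMBDA or tLAMBEG
  this.is -> not -> valid $ruby:syntax
  ^
  <compiled>:4: syntax error, unexpected tGVAR, expecting keyword_do_LAMBDA or tLAMBEG
  this.is -> not -> valid $ruby:syntax
  ^
TEXT

parse_failures(text).each { |f| puts "#{f.file}:#{f.lineno}: #{f.message}" }
```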
Updated by nobu (Nobuyoshi Nakada) almost 9 years ago
- Has duplicate Feature #11951: `RubyVM::InstructionSequence.compile` should return the error message within the raised error added
Updated by jrafanie (Joe Rafaniello) almost 9 years ago
Hi @nobu (Nobuyoshi Nakada), thanks for working on this feature in https://github.com/ruby/ruby/compare/trunk...nobu:feature/11868-SyntaxError-location.
For this example:
RubyVM::InstructionSequence.new("puts 'hi'\n puts 'hi2'\n\nthis.is -> not -> valid $ruby:syntax")
<compiled>:4: syntax error, unexpected keyword_not, expecting keyword_do_LAMBDA or tLAMBEG
this.is -> not -> valid $ruby:syntax
^
<compiled>:4: syntax error, unexpected tGVAR, expecting keyword_do_LAMBDA or tLAMBEG
this.is -> not -> valid $ruby:syntax
^
SyntaxError: compile error
from (irb):11:in `new'
from (irb):11
from /Users/joerafaniello/.rubies/ruby-2.2.4/bin/irb:11:in `<main>'
Using your branch on GitHub, we would get the file ('compiled') and the lineno ('4'), but not the message (or hint) with the "^" indicating where the syntax error occurred.
Is it possible to get the file, lineno and message (or hint)?
This is the type of code we are trying to eliminate (shelling out to ruby and capturing $stderr):
https://github.com/bbatsov/rubocop/blob/86e1acf67794bf6dd5d65812b91df475e44fa320/spec/support/mri_syntax_checker.rb#L51-L63
Thanks again for looking into this feature!
Joe
Updated by sawa (Tsuyoshi Sawada) almost 9 years ago
If the character position can be obtained in addition to the line number, then the "hint" can be reconstructed. (However, I think binding the syntax error information to the raised error rather than outputting it to $stderr would be a much cleaner solution.)
Updated by jrafanie (Joe Rafaniello) almost 9 years ago
I agree, it seems the best place to put this information is in the SyntaxError exception. If we need to keep compatibility by keeping the output to $stderr, I'm OK with that but I want to be able to rescue the SyntaxError and get the "file", "lineno" and the problem code (hint).
The hint or problem code could even be reconstructed exactly as Tsuyoshi Sawada says if we know the column position of the syntax error.
Tools like rubocop would most likely want exactly what is currently printed to $stderr but as methods on some object such as the exception.
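Reconstructing the hint from a position could look roughly like this. It is a sketch under assumptions: the helper name reconstruct_hint is hypothetical, lineno is taken to be 1-based, and column is taken to be the 0-based offset at which the caret should appear. The real column convention would depend on what the exception ends up exposing.

```ruby
# Hypothetical helper: rebuilds the two-line hint (source line plus caret)
# from a 1-based line number and a 0-based caret column (both assumptions).
def reconstruct_hint(source, lineno, column)
  line = source.lines[lineno - 1].to_s.chomp
  "#{line}\n#{' ' * column}^"
end

source = "puts 'hi'\n puts 'hi2'\n\nthis.is -> not -> valid $ruby:syntax"
puts reconstruct_hint(source, 4, 13)
```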
Updated by nobu (Nobuyoshi Nakada) over 8 years ago
- Has duplicate deleted (Feature #11951: `RubyVM::InstructionSequence.compile` should return the error message within the raised error)
Updated by nobu (Nobuyoshi Nakada) over 8 years ago
- Related to Feature #11951: `RubyVM::InstructionSequence.compile` should return the error message within the raised error added
Updated by nobu (Nobuyoshi Nakada) over 8 years ago
Updated.
$ ./ruby -e 'begin eval("this.is -> not -> valid $ruby:syntax"); rescue SyntaxError => e; e.failures.each {|ex|p [ex.lineno, ex.column, ex.mesg]}; end'
[1, 14, "syntax error, unexpected keyword_not, expecting keyword_do_LAMBDA or tLAMBEG\nthis.is -> not -> valid $ruby:syntax\n ^"]
[1, 29, "syntax error, unexpected tGVAR, expecting keyword_do_LAMBDA or tLAMBEG\nthis.is -> not -> valid $ruby:syntax\n ^"]