Bug #16819
openLine reporting off by one when reporting line of a hash?
Description
If I run this program:
TracePoint.new(:line) { |t| p t.lineno}.enable
def foo(a, b) # 2
a + b # 3
end # 4
# 5
foo 1, 2 # 6
# 7
A = { # 8
a: 1, # 9
b: 2 # 10
} # 11
I see:
system ~/work/jruby no_sourceposition * 2388% mri26 ../snippets/ast1.rb
2
6
3
9
I believe this 9 should be an 8 (it is what we currently emit for JRuby). I tried to figure out why this is the case and I patched RubyVM::AbstractSyntaxTree with the ability to note newline flag:
diff --git a/ast.c b/ast.c
index f0e8dd2eaf..df58006a96 100644
--- a/ast.c
+++ b/ast.c
@@ -7,6 +7,8 @@
#include "vm_core.h"
#include "iseq.h"
+#define RBOOL(v) ((v) ? Qtrue : Qfalse)
+
static VALUE rb_mAST;
static VALUE rb_cNode;
@@ -731,6 +733,16 @@ rb_ast_node_inspect(VALUE self)
return str;
}
+static VALUE
+rb_ast_node_newline(VALUE self)
+{
+ struct ASTNodeData *data;
+ TypedData_Get_Struct(self, struct ASTNodeData, &rb_node_type, data);
+
+ return RBOOL(data->node->flags & NODE_FL_NEWLINE);
+}
+
+
void
Init_ast(void)
{
@@ -756,5 +768,6 @@ Init_ast(void)
rb_define_method(rb_cNode, "last_lineno", rb_ast_node_last_lineno, 0);
rb_define_method(rb_cNode, "last_column", rb_ast_node_last_column, 0);
rb_define_method(rb_cNode, "children", rb_ast_node_children, 0);
+ rb_define_method(rb_cNode, "newline?", rb_ast_node_newline, 0);
rb_define_method(rb_cNode, "inspect", rb_ast_node_inspect, 0);
}
I also made a simple script:
source = File.read ARGV.shift
root = RubyVM::AbstractSyntaxTree.parse source
def print_node(node, indent = "")
if node.respond_to? :first_lineno
eol = node.newline? ? " <-- newline" : ""
$stdout.write "#{indent}(#{node.type}@#{node.first_lineno-1}-#{node.last_lineno-1})"
case node.type
when :LIT, :STR
puts " = #{node.children[0].inspect}#{eol}"
when :ARRAY
puts eol
node.children[0..-2].each do |child|
print_node(child, indent + " ")
end
when :FCALL
puts " = #{node.children[0]}#{eol}"
node.children[1..-1].each do |child|
print_node(child, indent + " ")
end
else
puts eol
node.children.each do |child|
print_node(child, indent + " ")
end
end
elsif node.nil?
puts "#{indent}nil"
else
puts "#{indent}#{node.inspect}"
end
end
print_node root
puts RubyVM::InstructionSequence.compile(source).disasm
If I run this I see MRI has line 8 marked as the newline (which JRuby also matches) but if we look at the disasm it would appear compile.c decided to put the line in a different location:
../ruby/ruby --disable-gems ../snippets/ast_mri.rb ../snippets/ast1.rb
(SCOPE@0-10)
[]
nil
(BLOCK@0-10)
(CALL@0-0) <-- newline
(ITER@0-0)
(CALL@0-0)
(CONST@0-0)
:TracePoint
:new
(ARRAY@0-0)
(LIT@0-0) = :line
(SCOPE@0-0)
[:t]
(ARGS@0-0)
1
nil
nil
nil
0
nil
nil
nil
nil
nil
(FCALL@0-0) = p <-- newline
(ARRAY@0-0)
(CALL@0-0)
(DVAR@0-0)
:t
:lineno
nil
:enable
nil
(DEFN@1-3) <-- newline
:foo
(SCOPE@1-3)
[:a, :b]
(ARGS@1-1)
2
nil
nil
nil
0
nil
nil
nil
nil
nil
(OPCALL@2-2) <-- newline
(LVAR@2-2)
:a
:+
(ARRAY@2-2)
(LVAR@2-2)
:b
(FCALL@5-5) = foo <-- newline
(ARRAY@5-5)
(LIT@5-5) = 1
(LIT@5-5) = 2
(CDECL@7-10) <-- newline
:A
(HASH@7-10)
(ARRAY@8-9)
(LIT@8-8) = :a
(LIT@8-8) = 1
(LIT@9-9) = :b
(LIT@9-9) = 2
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(11,18)> (catch: FALSE)
== catch table
| catch type: break st: 0000 ed: 0013 sp: 0000 cont: 0013
| == disasm: #<ISeq:block in <compiled>@<compiled>:1 (1,22)-(1,39)> (catch: FALSE)
| == catch table
| | catch type: redo st: 0001 ed: 0010 sp: 0000 cont: 0001
| | catch type: next st: 0001 ed: 0010 sp: 0000 cont: 0010
| |------------------------------------------------------------------------
| local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
| [ 1] t@0<Arg>
| 0000 nop ( 1)[Bc]
| 0001 putself [Li]
| 0002 getlocal_WC_0 t@0
| 0004 opt_send_without_block <callinfo!mid:lineno, argc:0, ARGS_SIMPLE>, <callcache>
| 0007 opt_send_without_block <callinfo!mid:p, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
| 0010 nop
| 0011 leave ( 1)[Br]
|------------------------------------------------------------------------
0000 opt_getinlinecache 7, <is:0> ( 1)[Li]
0003 getconstant :TracePoint
0005 opt_setinlinecache <is:0>
0007 putobject :line
0009 send <callinfo!mid:new, argc:1>, <callcache>, block in <compiled>
0013 nop
0014 opt_send_without_block <callinfo!mid:enable, argc:0, ARGS_SIMPLE>, <callcache>( 1)
0017 pop
0018 putspecialobject 1 ( 2)[Li]
0020 putobject :foo
0022 putiseq foo
0024 opt_send_without_block <callinfo!mid:core#define_method, argc:2, ARGS_SIMPLE>, <callcache>
0027 pop
0028 putself ( 6)[Li]
0029 putobject_INT2FIX_1_
0030 putobject 2
0032 opt_send_without_block <callinfo!mid:foo, argc:2, FCALL|ARGS_SIMPLE>, <callcache>
0035 pop
0036 duphash {:a=>1, :b=>2} ( 9)[Li]
0038 dup ( 8)
0039 putspecialobject 3
0041 setconstant :A
0043 leave
== disasm: #<ISeq:foo@<compiled>:2 (2,0)-(4,3)> (catch: FALSE)
local table (size: 2, argc: 2 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] a@0<Arg> [ 1] b@1<Arg>
0000 getlocal_WC_0 a@0 ( 3)[LiCa]
0002 getlocal_WC_0 b@1
0004 opt_plus <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
0007 leave ( 4)[Re]
Updated by ko1 (Koichi Sasada) over 4 years ago
- Status changed from Open to Assigned
- Assignee set to ko1 (Koichi Sasada)
Yes, it is intentional behavior because the first instruction of the A = { ... }
expression is line 9 (a: 1
).
Each line number attribute is attached to the instruction and there is no instruction at the line 8 for the performance, from Ruby 2.5 [Feature #14104].
I agree it is not good result.
Before Ruby 2.5, we inserted trace
instruction and this instruction has "line 8" attribute just before line 9.
However, the trace
instructions are almost nop and we removed it.
I wonder there is a good way to solve it.
Updated by enebo (Thomas Enebo) over 4 years ago
"Before Ruby 2.5, we inserted trace instruction and this instruction has "line 8" attribute just before line 9.
However, the trace instructions are almost nop and we removed it."
You mean that because you had two line instrs in a row with nothing between them you ended up eliminating it as unneeded work. We actually do a similar thing but I guess we somehow put an instr between line 8 and line 9 instr. Putting an instr between the two line instr which will be noop but somehow defeat that optimization maybe?
Updated by ko1 (Koichi Sasada) over 4 years ago
You mean that because you had two line instrs in a row with nothing between them you ended up eliminating it as unneeded work. We actually do a similar thing but I guess we somehow put an instr between line 8 and line 9 instr. Putting an instr between the two line instr which will be noop but somehow defeat that optimization maybe?
We changed the strategy: instead of inserting trace
instructions, but replace instructions with trace- prefix instructions if needed. Inserting nop (as old trace instruction) is one idea (which will be trace_nop insn if needed).