Bug #18801
closedDead YARV instructions produced when `branchif` is used
Description
Description¶
It seems there are unused YARV instructions produced when the snippet contains a branchif
instruction.
In the following example, the instructions numbers 0002 to 0004 won't ever be executed:
irb(main):003:0> puts RubyVM::InstructionSequence.compile("while 2+3; puts 'hi'; end").disasm
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,25)> (catch: FALSE)
0000 jump 12 ( 1)[Li]
0002 putnil
0003 pop
0004 jump 12
0006 putself
0007 putstring "hi"
0009 opt_send_without_block <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0011 pop
0012 putobject 2
0014 putobject 3
0016 opt_plus <calldata!mid:+, argc:1, ARGS_SIMPLE>[CcCr]
0018 branchif 6
0020 putnil
0021 leave
Similarly in this example, 0006-0008 won't be executed.
irb(main):003:0> puts RubyVM::InstructionSequence.compile("x = 9; while x; puts 'hi'; end").disasm
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,30)> (catch: FALSE)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0
0000 putobject 9 ( 1)[Li]
0002 setlocal_WC_0 x@0
0004 jump 16
0006 putnil
0007 pop
0008 jump 16
0010 putself
0011 putstring "hi"
0013 opt_send_without_block <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0015 pop
0016 getlocal_WC_0 x@0
0018 branchif 10
0020 putnil
0021 leave
Initially we thought those instructions (i.e. putnil, pop, jump) were used when the return value of the while-loop is needed.
irb(main):012:0> puts RubyVM::InstructionSequence.compile("x = while foo; 9; end").disasm
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,21)> (catch: FALSE)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0
0000 jump 4 ( 1)[Li]
0002 putnil
0003 pop
0004 putself
0005 opt_send_without_block <calldata!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 branchif 4
0009 putnil
0010 dup
0011 setlocal_WC_0 x@0
0013 leave
But it seems like some dead instructions (0002, 0003) in the example above still remains.
Are those instructions meant to be used for something else or is it a "bug" that it sticks around?
Perhaps it can be optimized away?
Updated by eightbitraptor (Matt V-H) over 2 years ago
wildmaples (Maple Ong) wrote:
It seems there are unused YARV instructions produced when the snippet contains a
branchif
instruction.
I'm not sure this is branchif
related. It looks like while true; end
generates the same dead instructions
❯ ruby --dump=insns -e 'while true; end'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,15)> (catch: FALSE)
0000 jump 4 ( 1)[Li]
0002 putnil
0003 pop
0004 jump 4
0006 putnil
0007 leave
Updated by eightbitraptor (Matt V-H) over 2 years ago
I investigated these instructions and whether or not they were dead code and
could be removed.
From the outside it looks like there are two things to note:
- they immediately follow an unconditional jump and so should theoretically
never be reached, and - when executed together they are effectively a no-op (push nil onto the stack,
and then remove it)
However, this isn't the whole picture. The summary is that they currently are
necessary for the correct operation of the next
keyword and cannot be removed.
A jump target (next_catch_label
) is inserted in between the putnil
and the
pop
. I did not investigate thorougly enough to fully understand how next
works and what the significance of the putnil
is, however all of my naive
attempts to remove these instructions caused either memory errors, or test
failures when working with loops.
It's possible to view the instructions generated, with the labels still in place
using RubyVM::InstructionSequence
directly:
❯ ruby -e "RubyVM::InstructionSequence.compile('(while rand < 0.9; class C; next 2; end; end)').to_a.last.each { |e| puts e.inspect }"
1
:RUBY_EVENT_LINE
[:jump, :label_14]
[:putnil]
:label_3
[:pop]
[:jump, :label_14]
The instructions in question are generated by the compile_loop
function in
compile.c
, which is executed when a NODE_UNTIL
or NODE_WHILE
tree node is
compiled.
ADD_LABEL(ret, adjust_label);
ADD_INSN(ret, line_node, putnil);
ADD_LABEL(ret, next_catch_label);
ADD_INSN(ret, line_node, pop);
ADD_INSNL(ret, line_node, jump, next_label);
If we comment out the ADD_INSN
lines for putnil
and pop
, leaving in the
jump targets next_catch_label
and next_label
, then one of the flow control
tests in bootstraptest/test_flow.rb
will fail. That test looks like this:
assert_equal %q{4}, %q{
def m a, b
a + b
end
m(1,
(i=0; while i<2
i+=1
class C
next 2
end
end; 3)
)
}
Without these instructions this code outputs 5
instead of 4
causing an
assertion failure.
In addition, the following will cause a Segmentation Fault with miniruby, when
those instructions are not present.
./miniruby -e "while rand < 0.9; class C; next 2; end; end"
Interestingly, removing the next
keyword from our test code snippet results in
different bytecode being generated by the compile_loop
function: The
next_catch_label
only appears when the next
keyword is used, despite the
code that inserts it not being obviously conditional
With next
:
❯ ruby -e "RubyVM::InstructionSequence.compile('(while rand < 0.9; class C; next 2; end; end)').to_a.last.each { |e| puts e.inspect }"
1
:RUBY_EVENT_LINE
[:jump, :label_14]
[:putnil]
:label_3
[:pop]
[:jump, :label_14]
Without next
:
❯ ruby -e "RubyVM::InstructionSequence.compile('(while rand < 0.9; class C; end; end)').to_a.last.each { |e| puts e.inspect }"
1
:RUBY_EVENT_LINE
[:jump, :label_14]
[:putnil]
[:pop]
[:jump, :label_14]
The questions I have are:
- How does that test output 5?
- Even given the presence of the
next_catch_label
, theputnil
instruction looks
like it can never be reached. So why does removing it result in a Stack
Underflow? - How does the conditional insertion of the
next_catch_label
work? - What is the significance of
adjust_label
(inserted before theputnil
), and
why is it never visible in the instruction sequence?
Updated by eightbitraptor (Matt V-H) over 2 years ago
My questions aside. I'm not sure this is a bug, and could probably be closed?
Updated by dylants (Dylan Thacker-Smith) over 2 years ago
Both of those instructions serve a purpose, but they could also be optimized away.
next_catch_label
is referenced from the catch table. That catch table may be optimized away, which is why you are seeing the label being omitted from the disassembly when next
is removed. Similarly, the pop
that follows that label could also be optimized away when the catch table is optimized away, otherwise, it is needed to pop off the value added by the throw
instruction for a non-local next
usage (e.g. that test was using it within a class definition block rather than locally within the method).
The putnil
instruction that precedes next_catch_label
seems to only be used to adjust the stack depth for that label, to account for the value thrown by next
. The instruction isn't needed for execution, so it could be replaced by an iseq element that adds an element to the stack without being compiled to an instruction.
Note that adjust_label
is an example of how the stack depth at a label gets used, where it gets used by the the ADD_ADJUST_RESTORE macro to create an ISEQ_ELEMENT_ADJUST. The ISEQ_ELEMENT_ADJUST also doesn't end up generating any instructions, since the compiler calculates the necessary adjustment to be 0. I don't think the label would ever end up in the disassembly, since it doesn't look like it is used as a jump target.
Updated by mame (Yusuke Endoh) over 2 years ago
- Status changed from Open to Rejected
I believe this is not a bug. The current YARV compiler is a product of trial and error, and we know that sometimes it emits dead code (mainly due to peephole optimization). It may be room for improvement, but not a bug.