Bug #14834
closedrb_profile_frames SEGV when PC adjusted on IFUNC
Description
Since r62052, we increment ec->cfp->pc
by one pointer width (e.g. 8 bytes) in gc_event_hook_body
around the EXEC_EVENT_HOOK
call.
This becomes a problem when the hook is on an IFUNC: in this case, pc == 0x0
, meaning we increment it to a non-zero value during that call.
rb_profile_frames
uses the following check to determine if frame info should be recorded:
if (cfp->iseq && cfp->pc) {
The example here is stackprof
, which calls rb_profile_frames
in a gc event hook. This will segfault currently, as the above check will pass.
calc_lineno
then attempts to calculate the line number:
size_t pos = (size_t)(pc - iseq->body->iseq_encoded);
This fails for a variety of reasons: iseq_encoded
isn't valid because iseq
isn't an rb_iseq_t
underneath, producing an essentially random value, and pc
is 0x8, so we underflow and eventually cause an overrun in succ_index_lookup
with a huge pos
argument.
We instead only adjust PC if it appears to be a valid pointer in the first place.
Files
Updated by kivikakk (Asherah Connor) over 6 years ago
It's also worth noting:
- is the increment of
pc
withec->cfp->pc++
correct? What if the instruction is multiple values wide? - there are similar
pc
increment/decrement pairs aroundEXEC_EVENT_HOOK
calls invm_trace
. Do we need to address these too?
Ping @tenderlovemaking (Aaron Patterson) as he helped me with this.
Updated by kivikakk (Asherah Connor) over 6 years ago
Further:
- is the original change correct? Some instructions have the
handle_frame?
property, which means they do increment the PC by the instruction width first. It's those that don't have that property which now increment the PC after. The gc hook caller always increments PC, assuming non-handle_frame?
instructions.
Updated by shyouhei (Shyouhei Urabe) over 6 years ago
- Status changed from Open to Assigned
- Assignee set to shyouhei (Shyouhei Urabe)
Thanks reporting! Will handle it.
Updated by kivikakk (Asherah Connor) over 6 years ago
Here's an easy way to reproduce the segfault on current trunk:
Gemfile
source "https://rubygems.org"
gem "stackprof"
test.rb
require 'stackprof'
StackProf.run(mode: :object) do
[123].group_by {}
end
Output:
$ ruby test.rb
test.rb:4: [BUG] Segmentation fault at 0x0000000000000000
ruby 2.6.0dev (2018-06-25 trunk 63747) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0007 p:---- s:0024 e:000023 IFUNC
c:0006 p:---- s:0021 e:000020 CFUNC :each
c:0005 p:---- s:0018 e:000017 CFUNC :group_by
c:0004 p:0007 s:0014 e:000013 BLOCK test.rb:4 [FINISH]
c:0003 p:---- s:0011 e:000010 CFUNC :run
c:0002 p:0020 s:0006 e:000005 EVAL test.rb:3 [FINISH]
c:0001 p:0000 s:0003 E:002330 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
test.rb:3:in `<main>'
test.rb:3:in `run'
test.rb:4:in `block in <main>'
test.rb:4:in `group_by'
test.rb:4:in `each'
Segmentation fault
When using gdb:
$ gdb -q --args `rbenv which ruby` test.rb
Reading symbols from /home/kivikakk/.rbenv/versions/ruby-trunk/bin/ruby...done.
(gdb) run
Starting program: /home/kivikakk/.rbenv/versions/ruby-trunk/bin/ruby test.rb
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7ff4700 (LWP 23118)]
Program received signal SIGSEGV, Segmentation fault.
0x00005555555c84ef in succ_index_lookup (sd=0xe0458d48e0458948, x=-454013225) at iseq.c:3089
3089 return imm_block_rank_get(sd->imm_part[i], j);
(gdb) bt 30
#0 0x00005555555c84ef in succ_index_lookup (sd=0xe0458d48e0458948, x=-454013225) at iseq.c:3089
#1 0x00005555555c2c3f in get_insn_info_succinct_bitvector (iseq=0x555555cd0b18, pos=428210124447829719) at iseq.c:1522
#2 0x00005555555c2c7b in get_insn_info (iseq=0x555555cd0b18, pos=428210124447829719) at iseq.c:1530
#3 0x00005555555c2ca0 in rb_iseq_line_no (iseq=0x555555cd0b18, pos=428210124447829719) at iseq.c:1598
#4 0x000055555571a13e in calc_lineno (iseq=0x555555cd0b18, pc=0x8) at vm_backtrace.c:37
#5 0x000055555571c1ec in rb_profile_frames (start=0, limit=2048, buff=0x7ffff6740530 <_stackprof+176>, lines=0x7ffff6744530 <_stackprof+16560>) at vm_backtrace.c:1295
#6 0x00007ffff653ee12 in stackprof_record_sample () at stackprof.c:484
#7 0x00007ffff653efe7 in stackprof_job_handler (data=0x0) at stackprof.c:539
#8 0x00007ffff653f0d3 in stackprof_newobj_handler (tpval=93825000082360, data=0x0) at stackprof.c:561
#9 0x000055555572086d in tp_call_trace (tpval=93825000082360, trace_arg=0x7fffffffb440) at vm_trace.c:1077
#10 0x000055555571ed6e in exec_hooks_body (ec=0x555555acf8c8, list=0x555555acf340, trace_arg=0x7fffffffb440) at vm_trace.c:272
#11 0x000055555571ee65 in exec_hooks_unprotected (ec=0x555555acf8c8, vm=0x555555acef20, list=0x555555acf340, trace_arg=0x7fffffffb440) at vm_trace.c:301
#12 0x000055555571f0ab in rb_exec_event_hooks (trace_arg=0x7fffffffb440, pop_p=0) at vm_trace.c:345
#13 0x00005555555895cc in rb_exec_event_hook_orig (ec=0x555555acf8c8, flag=1048576, self=93825000082320, id=0, called_id=0, klass=0, data=93825000082120, pop_p=0) at vm_core.h:1778
#14 0x000055555558afb7 in gc_event_hook_body (ec=0x555555acf8c8, objspace=0x555555acf650, event=1048576, data=93825000082120) at gc.c:1810
#15 0x000055555558b242 in newobj_slowpath (klass=93824998289320, flags=8199, v1=0, v2=0, v3=0, objspace=0x555555acf650, wb_protected=1) at gc.c:1924
#16 0x000055555558b294 in newobj_slowpath_wb_protected (klass=93824998289320, flags=8199, v1=0, v2=0, v3=0, objspace=0x555555acf650) at gc.c:1934
#17 0x000055555558b3c9 in newobj_of (klass=93824998289320, flags=8199, v1=0, v2=0, v3=0, wb_protected=1) at gc.c:1966
#18 0x000055555558b462 in rb_wb_protected_newobj_of (klass=93824998289320, flags=8199) at gc.c:1982
#19 0x0000555555727121 in ary_alloc (klass=93824998289320) at array.c:437
#20 0x00005555557271b4 in ary_new (klass=93824998289320, capa=1) at array.c:466
#21 0x00005555557274bb in rb_ary_tmp_new_from_values (klass=93824998289320, n=1, elts=0x7fffffffb700) at array.c:514
#22 0x00005555557275ae in rb_ary_new_from_values (n=1, elts=0x7fffffffb700) at array.c:526
#23 0x000055555578bf1c in group_by_i (i=247, hash=93825000082280, argc=1, argv=0x7fffffffb868, blockarg=8) at enum.c:894
#24 0x00005555557006cf in vm_yield_with_cfunc (ec=0x555555acf8c8, captured=0x7ffff6e53f10, self=93825000082320, argc=1, argv=0x7fffffffb868, block_handler=0) at vm_insnhelper.c:2569
#25 0x0000555555714697 in invoke_block_from_c_bh (ec=0x555555acf8c8, block_handler=140737335607059, argc=1, argv=0x7fffffffb868, passed_block_handler=0, cref=0x0, is_lambda=0,
force_blockarg=0) at vm.c:1079
#26 0x00005555557147f7 in vm_yield (ec=0x555555acf8c8, argc=1, argv=0x7fffffffb868) at vm.c:1119
#27 0x000055555571050e in rb_yield_0 (argc=1, argv=0x7fffffffb868) at vm_eval.c:972
#28 0x000055555571052d in rb_yield_1 (val=247) at vm_eval.c:978
#29 0x000055555571055f in rb_yield (val=247) at vm_eval.c:988
(More stack frames follow...)
Updated by shyouhei (Shyouhei Urabe) over 6 years ago
- Related to Bug #14809: calc_lineno() returns wrong location added
Updated by shyouhei (Shyouhei Urabe) over 6 years ago
This issue and #14809 are caused by the same problem.
Updated by shyouhei (Shyouhei Urabe) over 6 years ago
- Status changed from Assigned to Closed
Applied in changeset trunk|r63763.
give up insn attr handles_frame
I introduced this mechanism in r62051 to speed things up. Later it
was reported that the change causes problems. I searched for
workarounds but nothing seemed appropriate. I hereby officially
give it up. The idea to move ADD_PC around was a mistake.
Fixes [Bug #14809] and [Bug #14834].
Signed-off-by: Urabe, Shyouhei shyouhei@ruby-lang.org
Updated by jhawthorn (John Hawthorn) over 5 years ago
- Status changed from Closed to Open
I believe since r64737 this is an issue again.
On Ruby 2.6.3 I get an almost identical stacktrace as before
Thread 1 "ruby" received signal SIGSEGV, Segmentation fault.
0x00005555555cac9e in succ_index_lookup (x=<optimized out>, sd=0x10c483485c750000) at iseq.c:3357
3357 return imm_block_rank_get(sd->imm_part[i], j);
(gdb) bt
#0 0x00005555555cac9e in succ_index_lookup (x=<optimized out>, sd=0x10c483485c750000) at iseq.c:3357
#1 get_insn_info_succinct_bitvector (iseq=<optimized out>, pos=<optimized out>) at iseq.c:1593
#2 get_insn_info (pos=<optimized out>, iseq=<optimized out>) at iseq.c:1601
#3 rb_iseq_line_no (iseq=<optimized out>, pos=<optimized out>) at iseq.c:1669
#4 0x000055555571cf83 in calc_lineno (pc=<optimized out>, iseq=<optimized out>) at vm_backtrace.c:47
#5 rb_profile_frames (start=start@entry=0, limit=limit@entry=2048, buff=buff@entry=0x7fffffff86f0,
lines=lines@entry=0x7fffffff66f0) at vm_backtrace.c:1308
#6 0x00007ffff5532e83 in newobj (tpval=<optimized out>, ptr=0x555555feb610)
at ../../../../ext/allocation_sampler/allocation_sampler.c:227
#7 0x000055555571dc49 in tp_call_trace (tpval=<optimized out>, trace_arg=<optimized out>) at vm_trace.c:1194
#8 0x000055555571e206 in exec_hooks_body (ec=ec@entry=0x5555558c13b8, trace_arg=0x7fffffffc7d0, trace_arg@entry=0x7fffffffc790,
list=<optimized out>) at vm_trace.c:304
#9 0x0000555555720ebc in exec_hooks_unprotected (trace_arg=0x7fffffffc790, list=0x5555558c0e08, ec=<optimized out>)
at vm_trace.c:333
#10 rb_exec_event_hooks (trace_arg=trace_arg@entry=0x7fffffffc7d0, hooks=<optimized out>, pop_p=pop_p@entry=0) at vm_trace.c:378
#11 0x000055555558f72f in rb_exec_event_hook_orig (pop_p=0, data=<optimized out>, klass=0, called_id=0, id=0,
self=<optimized out>, flag=<optimized out>, hooks=<optimized out>, ec=0x5555558c13b8) at vm_core.h:1870
#12 gc_event_hook_body (ec=0x5555558c13b8, event=<optimized out>, data=<optimized out>, objspace=<optimized out>) at gc.c:1805
#13 0x0000555555599cf8 in newobj_slowpath (wb_protected=1, objspace=0x5555558c1140, v3=0, v2=<optimized out>, v1=0, flags=8199,
klass=<optimized out>) at vm_core.h:1735
#14 newobj_slowpath_wb_protected (klass=<optimized out>, flags=8199, v1=0, v2=<optimized out>, v3=0, objspace=0x5555558c1140)
at gc.c:1929
Updated by nagachika (Tomoyuki Chikanaga) over 5 years ago
- Backport changed from 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN to 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED
Updated by shyouhei (Shyouhei Urabe) over 5 years ago
- Status changed from Open to Closed
Applied in changeset git|5d33f787169bcc3594d2264726695d58c4a06899.
fix tracepoint + backtrace SEGV
PC modification in gc_event_hook_body was careless. There are (so
to say) abnormal iseqs stored in the cfp. We have to check sanity
before we touch the PC.
This has not been fixed because there was no way to (ab)use the
setup from pure-Ruby. However by using our official C APIs it is
possible to touch such frame(s), resulting in SEGV.
Fixes [Bug #14834].
Updated by shyouhei (Shyouhei Urabe) over 5 years ago
Yes indeed. I was wrong, and the analysis by the OP is basically right. Pushed 5d33f787169bcc3594d2264726695d58c4a06899 which I hope should fix the problem. Sorry for the trouble.
Updated by tenderlovemaking (Aaron Patterson) over 5 years ago
Thanks for the fix @shyouhei (Shyouhei Urabe)!
Updated by nagachika (Tomoyuki Chikanaga) about 5 years ago
- Backport changed from 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED to 2.4: DONTNEED, 2.5: DONTNEED, 2.6: DONE
ruby_2_6 r67745 merged revision(s) 5d33f787169bcc3594d2264726695d58c4a06899,8b162ce9d1003e4e469d8f48cb9a2076fd45b47c.