Bug #21511 (Closed)
Use-after-free of the execution context after the fiber object carrying it is freed in GC
Description
In bootstraptest/test_thread.rb,
assert_equal 'ok', %{
  File.write("zzz_t1.rb", <<-END)
    begin
      Thread.new { fork { GC.start } }.join
      pid, status = Process.wait2
      $result = status.success? ? :ok : :ng
    rescue NotImplementedError
      $result = :ok
    end
  END
  require "./zzz_t1.rb"
  $result
}
# in build/
make btest BTESTS="file_containing_above.rb"
# or
ruby --disable=gems "../bootstraptest/runner.rb" --ruby="./miniruby -I../lib -I. -I.ext/common -r./x86_64-linux-fake --disable-gems" file_containing_above.rb
Suppose thread 1 called Thread.new and created thread 2. The process forked by thread 2 initiates GC with GC.start, which sweeps the fiber object embedded in RTypedData during the gc_sweep_rest() stage of the sweep, calling fiber_free(). That fiber object contains the execution context of thread 1, in the rb_execution_context_t saved_ec field of its cont.
Since the fiber object is freed, the allocated area it points to should be invalid, including the embedded ec struct, but after thread 2 joins, thread 1 still uses that ec via rb_current_thread(), causing a use-after-free.
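For context, the relevant nesting in cont.c looks roughly like this (a simplified sketch with most fields omitted; only the members named above are shown, so treat it as illustrative rather than the exact definitions):

/* simplified sketch of the layout in cont.c; most fields omitted */
typedef struct rb_context_struct {
    /* ... */
    rb_execution_context_t saved_ec;   /* the ec embedded in the continuation */
    /* ... */
} rb_context_t;

struct rb_fiber_struct {                /* rb_fiber_t */
    rb_context_t cont;                  /* fiber->cont.saved_ec is the embedded ec */
    /* ... */
};

/* fiber_free() tears down fiber->cont via cont_free(), which ultimately
 * ruby_xfree()s the whole fiber allocation, so cont.saved_ec lies inside
 * the freed region as well. */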
Updated by nobu (Nobuyoshi Nakada) 30 days ago
- Status changed from Open to Feedback
I can't reproduce it with ruby_3_4 (1e3d24a0f47) on aarch64-linux.
What version is commit:de8de51182?
Updated by tuonigou (tianyang sun) 30 days ago
nobu (Nobuyoshi Nakada) wrote in #note-1:
I can't reproduce it with ruby_3_4 (1e3d24a0f47) on aarch64-linux.
What version is commit:de8de51182?
Sorry, I was using the 3.4.1 stable release from https://www.ruby-lang.org/en/downloads/, which is no longer listed there. The version number is just from my adding version control; I did not change anything in the codebase. Sorry for the confusion.
Updated by nobu (Nobuyoshi Nakada) 30 days ago
- Status changed from Feedback to Open
3.4.1 is outdated.
Could you try with a more recent version?
BTW, the 3.4.1 tarball is not listed there, but it still exists:
https://cache.ruby-lang.org/pub/ruby/3.4/ruby-3.4.1.tar.gz
And its RUBY_REVISION in revision.h is "48d4efcb85", and the date is 2024-12-25.
Updated by tuonigou (tianyang sun) 30 days ago
nobu (Nobuyoshi Nakada) wrote in #note-3:
3.4.1 is outdated. Could you try with a more recent version? BTW, the 3.4.1 tarball is not listed there, but it still exists:
https://cache.ruby-lang.org/pub/ruby/3.4/ruby-3.4.1.tar.gz
And its RUBY_REVISION in revision.h is "48d4efcb85", and the date is 2024-12-25.
Thank you. I am trying to compile the current master branch, and maybe the 3.4.4 release.
Updated by tuonigou (tianyang sun) 30 days ago
Using ruby 3.5.0dev (2025-07-14T05:11:58Z master 8f54b5bb93) +PRISM [x86_64-linux]:
# at this point the forked process (the `fork { GC.start }` part) is in GC
# thread 3.3 is the thread created by Thread.new
(gdb) i threads
Id Target Id Frame
1.1 Thread 0x7ffff7de4580 (LWP 2916144) "ruby" vfork ()
at ../sysdeps/unix/sysv/linux/x86_64/vfork.S:41
1.2 Thread 0x7fffdddff640 (LWP 2916147) "ruby" 0x00007ffff7b25e2e in epoll_wait (
epfd=4, events=0x555555b219fc <timer_th+28>, maxevents=16, timeout=-1)
at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
1.3 Thread 0x7fffdc0bd640 (LWP 2916150) "runner.rb:546" 0x00007ffff7b18bcf in __GI___poll (
fds=0x7fffdc0bb1f0, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
3.1 Thread 0x7ffff7de4580 (LWP 2916151) "miniruby" __futex_abstimed_wait_common64 (
private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555555b34350)
at ./nptl/futex-internal.c:57
3.3 Thread 0x7fffdc1be640 (LWP 2916153) "zzz_t1.rb:2" arch_fork (ctid=0x7fffdc1be910)
at ../sysdeps/unix/sysv/linux/arch-fork.h:52
* 4.1 Thread 0x7fffdc1be640 (LWP 2916154) "zzz_t1.rb:2" cont_free (ptr=0x555555ba84f0)
at ../cont.c:1094
# the ec addresses in these two threads; they will be used after being freed in thread 4.1's GC
(gdb) t 3.1
[Switching to thread 3.1 (Thread 0x7ffff7de4580 (LWP 2916151))]
#0 __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0,
futex_word=0x555555b34350) at ./nptl/futex-internal.c:57
57 ./nptl/futex-internal.c: No such file or directory.
(gdb) p ruby_current_ec
$6 = (struct rb_execution_context_struct *) 0x555555b3c230
(gdb) t 3.3
[Switching to thread 3.3 (Thread 0x7fffdc1be640 (LWP 2916153))]
#0 arch_fork (ctid=0x7fffdc1be910) at ../sysdeps/unix/sysv/linux/arch-fork.h:52
52 ../sysdeps/unix/sysv/linux/arch-fork.h: No such file or directory.
(gdb) p ruby_current_ec
$7 = (struct rb_execution_context_struct *) 0x555555ba8540
# fiber_free() in the GC frees the fibers that contain the above ec's
Thread 4.1 "zzz_t1.rb:2" hit Breakpoint 3, fiber_free (ptr=0x555555b3c1e0) at ../cont.c:1170
1170 rb_fiber_t *fiber = ptr;
(gdb) p fiber
$1 = (rb_fiber_t *) 0x555555b3c1e0
ruby_xfree (x=0x555555b3c1e0) at ../gc.c:5301
5301 ruby_sized_xfree(x, 0);
(gdb) p x
$2 = (void *) 0x555555b3c1e0
Thread 4.1 "zzz_t1.rb:2" hit Breakpoint 3, fiber_free (ptr=0x555555ba84f0) at ../cont.c:1170
1170 rb_fiber_t *fiber = ptr;
(gdb) p fiber
$4 = (rb_fiber_t *) 0x555555ba84f0
1094 ruby_xfree(ptr);
(gdb) p ptr
$5 = (void *) 0x555555ba84f0
# after 4.1 exits
[Inferior 4 (process 2916154) exited normally]
# the ec's were then used
(gdb) awatch *0x555555b3c230
Hardware access (read/write) watchpoint 4: *0x555555b3c230
(gdb) awatch *0x555555ba8540
Hardware access (read/write) watchpoint 5: *0x555555ba8540
Thread 3.3 "zzz_t1.rb:2" hit Hardware access (read/write) watchpoint 5: *0x555555ba8540
Old value = -141955056
New value = 0
rb_ec_set_vm_stack (ec=0x555555ba8540, stack=0x0, size=0) at ../vm.c:3648
3648 ec->vm_stack_size = size;
(gdb) p ec
$8 = (rb_execution_context_t *) 0x555555ba8540
Thread 3.1 "miniruby" hit Hardware access (read/write) watchpoint 4: *0x555555b3c230
Value = -137850864
0x00005555558a0bd1 in ruby_vm_destruct (vm=0x555555b35310) at ../vm.c:3144
3144 VALUE *stack = th->ec->vm_stack;
(gdb) p th->ec
$9 = (rb_execution_context_t *) 0x555555b3c230
Thread 3.1 "miniruby" hit Hardware access (read/write) watchpoint 4: *0x555555b3c230
Old value = -137850864
New value = 0
rb_ec_set_vm_stack (ec=0x555555b3c230, stack=0x0, size=0) at ../vm.c:3648
3648 ec->vm_stack_size = size;
(gdb) p ec
$10 = (rb_execution_context_t *) 0x555555b3c230
With sizeof(rb_fiber_t) = 0x250 and sizeof(struct rb_execution_context_struct) = 0x170:
[0x555555b3c230, 0x555555b3c3a0] is within [0x555555b3c1e0, 0x555555b3c430], the range of the rb_fiber_t that was freed
[0x555555ba8540, 0x555555ba86b0] is within [0x555555ba84f0, 0x555555ba8740], the range of the rb_fiber_t that was freed
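A quick standalone check of that containment claim (the addresses are copied from the gdb session above and the sizes from the report, so this is only as reliable as those numbers):

/* hypothetical sanity check of the containment claim above */
#include <assert.h>
#include <stdint.h>

int
main(void)
{
    const uintptr_t fiber1 = 0x555555b3c1e0, ec1 = 0x555555b3c230;
    const uintptr_t fiber2 = 0x555555ba84f0, ec2 = 0x555555ba8540;
    const uintptr_t fiber_size = 0x250, ec_size = 0x170;

    /* each ec range falls entirely inside its freed fiber's allocation */
    assert(ec1 >= fiber1 && ec1 + ec_size <= fiber1 + fiber_size);
    assert(ec2 >= fiber2 && ec2 + ec_size <= fiber2 + fiber_size);

    /* both ec's sit 0x50 bytes into their fiber, i.e. at the same offset,
     * consistent with the ec being embedded at a fixed offset in rb_fiber_t */
    assert(ec1 - fiber1 == 0x50 && ec2 - fiber2 == 0x50);
    return 0;
}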
Updated by luke-gru (Luke Gruber) 29 days ago
· Edited
GC inside a forked process should not affect the parent. Are you getting a crash and a stack trace from running this program? If so, it would be helpful if you uploaded the stack trace.
Updated by tuonigou (tianyang sun) 28 days ago
luke-gru (Luke Gruber) wrote in #note-6:
GC inside a forked process should not affect the parent. Are you getting a crash and a stack trace from running this program? If so, it would be helpful if you uploaded the stack trace.
No, I am not experiencing a crash from this. The test also passes, and it seems to have minimal effect on the program. I just found that the ec is used after being freed, and it would be good if this could be fixed.
Here are the backtraces for each time the ec is referenced after the GC, and the backtrace in the forked process just before the fiber is freed.
(gdb) p ruby_current_ec
$1 = (struct rb_execution_context_struct *) 0x555555b3b230
(gdb) t 3.3
[Switching to thread 3.3 (Thread 0x7fffdc1be640 (LWP 2954909))]
#0 arch_fork (ctid=0x7fffdc1be910) at ../sysdeps/unix/sysv/linux/arch-fork.h:52
52 ../sysdeps/unix/sysv/linux/arch-fork.h: No such file or directory.
(gdb) p ruby_current_ec
$2 = (struct rb_execution_context_struct *) 0x555555ba7540
(gdb) awatch *0x555555b3b230
Hardware access (read/write) watchpoint 2: *0x555555b3b230
(gdb) awatch *0x555555ba7540
Hardware access (read/write) watchpoint 3: *0x555555ba7540
(gdb) c
Continuing.
[New Thread 0x7fffdddff640 (LWP 2954912)]
Thread 3.3 "zzz_t1.rb:2" hit Hardware access (read/write) watchpoint 3: *0x555555ba7540
Old value = -141955056
New value = 0
rb_ec_set_vm_stack (ec=0x555555ba7540, stack=0x0, size=0) at ../vm.c:3648
3648 ec->vm_stack_size = size;
(gdb) bt
#0 rb_ec_set_vm_stack (ec=0x555555ba7540, stack=0x0, size=0) at ../vm.c:3648
#1 0x00005555558a1707 in rb_ec_clear_vm_stack (ec=0x555555ba7540) at ../vm.c:3677
#2 0x0000555555634122 in rb_threadptr_root_fiber_terminate (th=0x555555baf7e0) at ../cont.c:2591
#3 0x0000555555829165 in thread_cleanup_func_before_exec (th_ptr=0x555555baf7e0)
at ../thread.c:511
#4 0x000055555582919e in thread_cleanup_func (th_ptr=0x555555baf7e0, atfork=0) at ../thread.c:520
#5 0x0000555555829d54 in thread_start_func_2 (th=0x555555baf7e0, stack_start=0x7fffdc1bddd8)
at ../thread.c:778
#6 0x0000555555822a39 in call_thread_start_func_2 (th=0x555555baf7e0) at ../thread_pthread.c:2237
#7 0x0000555555822b5d in nt_start (ptr=0x555555c43150) at ../thread_pthread.c:2282
#8 0x00007ffff7a94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#9 0x00007ffff7b26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) c
Continuing.
[LWP 2954909 exited]
[LWP 2954912 exited]
[Switching to Thread 0x7ffff7de4580 (LWP 2954907)]
Thread 3.1 "miniruby" hit Hardware access (read/write) watchpoint 2: *0x555555b3b230
Value = -137850864
0x00005555558a05ad in ruby_vm_destruct (vm=0x555555b34310) at ../vm.c:3144
3144 VALUE *stack = th->ec->vm_stack;
(gdb) bt
#0 0x00005555558a05ad in ruby_vm_destruct (vm=0x555555b34310) at ../vm.c:3144
#1 0x0000555555668327 in rb_ec_cleanup (ec=0x555555b3b230, ex=RUBY_TAG_NONE) at ../eval.c:264
#2 0x0000555555668618 in ruby_run_node (n=0x7ffff79b3148) at ../eval.c:320
#3 0x0000555555585784 in rb_main (argc=8, argv=0x7fffffffe848) at ../main.c:42
#4 0x00005555555857fc in main (argc=8, argv=0x7fffffffe848) at ../main.c:62
(gdb) c
Continuing.
Thread 3.1 "miniruby" hit Hardware access (read/write) watchpoint 2: *0x555555b3b230
Old value = -137850864
New value = 0
rb_ec_set_vm_stack (ec=0x555555b3b230, stack=0x0, size=0) at ../vm.c:3648
3648 ec->vm_stack_size = size;
(gdb) bt
#0 rb_ec_set_vm_stack (ec=0x555555b3b230, stack=0x0, size=0) at ../vm.c:3648
#1 0x00005555558a1707 in rb_ec_clear_vm_stack (ec=0x555555b3b230) at ../vm.c:3677
#2 0x0000555555631b6c in fiber_stack_release (fiber=0x555555b3b1e0) at ../cont.c:923
#3 0x0000555555631ea9 in cont_free (ptr=0x555555b3b1e0) at ../cont.c:1086
#4 0x000055555563214e in fiber_free (ptr=0x555555b3b1e0) at ../cont.c:1179
#5 0x00005555556340cf in rb_threadptr_root_fiber_release (th=0x555555b32ac0) at ../cont.c:2578
#6 0x00005555558a1500 in thread_free (ptr=0x555555b32ac0) at ../vm.c:3590
#7 0x00005555558a0698 in ruby_vm_destruct (vm=0x555555b34310) at ../vm.c:3180
#8 0x0000555555668327 in rb_ec_cleanup (ec=0x555555b3b230, ex=RUBY_TAG_NONE) at ../eval.c:264
#9 0x0000555555668618 in ruby_run_node (n=0x7ffff79b3148) at ../eval.c:320
#10 0x0000555555585784 in rb_main (argc=8, argv=0x7fffffffe848) at ../main.c:42
#11 0x00005555555857fc in main (argc=8, argv=0x7fffffffe848) at ../main.c:62
Thread 4.1 "zzz_t1.rb:2" hit Breakpoint 3, fiber_free (ptr=0x555555b3b1e0) at ../cont.c:1170
1170 rb_fiber_t *fiber = ptr;
(gdb) bt
#0 fiber_free (ptr=0x555555b3b1e0) at ../cont.c:1170
#1 0x00005555556340cf in rb_threadptr_root_fiber_release (th=0x555555b32ac0) at ../cont.c:2578
#2 0x00005555558a1500 in thread_free (ptr=0x555555b32ac0) at ../vm.c:3590
#3 0x000055555568f87e in rb_data_free (objspace=0x555555b368b0, obj=140737347626040)
at ../gc.c:1181
#4 0x000055555568fc5c in rb_gc_obj_free (objspace=0x555555b368b0, obj=140737347626040)
at ../gc.c:1352
#5 0x0000555555681b88 in gc_sweep_plane (objspace=0x555555b368b0, heap=0x555555b368d0,
p=140737347626040, bitset=1, ctx=0x7fffdc1bb410) at ../gc/default/default.c:3476
#6 0x0000555555681e6b in gc_sweep_page (objspace=0x555555b368b0, heap=0x555555b368d0,
ctx=0x7fffdc1bb410) at ../gc/default/default.c:3561
#7 0x0000555555682715 in gc_sweep_step (objspace=0x555555b368b0, heap=0x555555b368d0)
at ../gc/default/default.c:3842
#8 0x000055555568298a in gc_sweep_rest (objspace=0x555555b368b0) at ../gc/default/default.c:3910
#9 0x0000555555683013 in gc_sweep (objspace=0x555555b368b0) at ../gc/default/default.c:4080
#10 0x0000555555688100 in gc_start (objspace=0x555555b368b0, reason=107528)
at ../gc/default/default.c:6426
#11 0x0000555555687d99 in garbage_collect (objspace=0x555555b368b0, reason=107520)
at ../gc/default/default.c:6307
#12 0x00005555556889dd in rb_gc_impl_start (objspace_ptr=0x555555b368b0, full_mark=true,
immediate_mark=true, immediate_sweep=true, compact=false) at ../gc/default/default.c:6759
#13 0x00005555556937e5 in gc_start_internal (ec=0x555555ba7540, self=140737347700600,
full_mark=20, immediate_mark=20, immediate_sweep=20, compact=0) at ../gc.c:3504
#14 0x00005555558819a3 in builtin_invoker4 (ec=0x555555ba7540, self=140737347700600,
argv=0x7ffff789f0b8, funcptr=0x555555693763 <gc_start_internal>) at ../vm_insnhelper.c:7308
#15 0x0000555555882272 in invoke_bf (ec=0x555555ba7540, reg_cfp=0x7ffff799eef8,
bf=0x555555b128a0 <gc_table>, argv=0x7ffff789f0b8) at ../vm_insnhelper.c:7421
#16 0x00005555558822b0 in vm_invoke_builtin (ec=0x555555ba7540, cfp=0x7ffff799eef8,
bf=0x555555b128a0 <gc_table>, argv=0x7ffff789f0b8) at ../vm_insnhelper.c:7429
#17 0x00005555558897cf in vm_exec_core (ec=0x555555ba7540) at ../insns.def:1676
#18 0x000055555589ec76 in rb_vm_exec (ec=0x555555ba7540) at ../vm.c:2621
#19 0x000055555589b95d in invoke_iseq_block_from_c (me=0x0, is_lambda=0, cref=0x0,
passed_block_handler=0, kw_splat=0, argv=0x0, argc=0, self=140737350192440,
captured=0x7ffff799efb8, ec=0x555555ba7540) at ../vm.c:1651
#20 invoke_block_from_c_bh (ec=0x555555ba7540, block_handler=140737347448761, argc=0, argv=0x0,
kw_splat=0, passed_block_handler=0, cref=0x0, is_lambda=0, force_blockarg=0) at ../vm.c:1665
#21 0x000055555589bb19 in vm_yield_with_cref (ec=0x555555ba7540, argc=0, argv=0x0, kw_splat=0,
cref=0x0, is_lambda=0) at ../vm.c:1702
#22 0x000055555589bb57 in vm_yield (ec=0x555555ba7540, argc=0, argv=0x0, kw_splat=0)
at ../vm.c:1710
#23 0x0000555555895745 in rb_yield_0 (argc=0, argv=0x0) at ../vm_eval.c:1362
#24 0x0000555555895799 in rb_yield (val=36) at ../vm_eval.c:1375
#25 0x000055555566a07e in rb_protect (proc=0x55555589576a <rb_yield>, data=36,
pstate=0x7fffdc1bc770) at ../eval.c:1060
#26 0x0000555555766714 in rb_f_fork (obj=140737350192440) at ../process.c:4290
#27 0x0000555555876b60 in ractor_safe_call_cfunc_0 (recv=140737350192440, argc=0,
argv=0x7ffff789f048, func=0x5555557666c3 <rb_f_fork>) at ../vm_insnhelper.c:3600
#28 0x0000555555877769 in vm_call_cfunc_with_frame_ (ec=0x555555ba7540, reg_cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0, argc=0, argv=0x7ffff789f048, stack_bottom=0x7ffff789f040)
at ../vm_insnhelper.c:3784
#29 0x00005555558779e1 in vm_call_cfunc_with_frame (ec=0x555555ba7540, reg_cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0) at ../vm_insnhelper.c:3830
#30 0x0000555555877b0e in vm_call_cfunc_other (ec=0x555555ba7540, reg_cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0) at ../vm_insnhelper.c:3856
#31 0x0000555555877f56 in vm_call_cfunc (ec=0x555555ba7540, reg_cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0) at ../vm_insnhelper.c:3938
#32 0x000055555587acac in vm_call_method_each_type (ec=0x555555ba7540, cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0) at ../vm_insnhelper.c:4763
#33 0x000055555587b7fd in vm_call_method (ec=0x555555ba7540, cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0) at ../vm_insnhelper.c:4900
#34 0x000055555587b96a in vm_call_general (ec=0x555555ba7540, reg_cfp=0x7ffff799efa0,
calling=0x7fffdc1bcbd0) at ../vm_insnhelper.c:4933
#35 0x000055555587e1ad in vm_sendish (ec=0x555555ba7540, reg_cfp=0x7ffff799efa0,
cd=0x555555baf340, block_handler=140737347448761, method_explorer=mexp_search_method)
at ../vm_insnhelper.c:5991
#36 0x0000555555885aba in vm_exec_core (ec=0x555555ba7540) at ../insns.def:851
#37 0x000055555589ec76 in rb_vm_exec (ec=0x555555ba7540) at ../vm.c:2621
#38 0x000055555589bfe2 in invoke_iseq_block_from_c (me=0x0, is_lambda=0, cref=0x0,
passed_block_handler=0, kw_splat=0, argv=0x7fffdc1bdc10, argc=0, self=140737350192440,
captured=0x555555bb1790, ec=0x555555ba7540) at ../vm.c:1651
#39 invoke_block_from_c_proc (me=0x0, is_lambda=0, passed_block_handler=0, kw_splat=0,
argv=0x7fffdc1bdc10, argc=0, self=140737350192440, proc=0x555555bb1790, ec=0x555555ba7540)
at ../vm.c:1745
#40 vm_invoke_proc (ec=0x555555ba7540, proc=0x555555bb1790, self=140737350192440, argc=0,
argv=0x7fffdc1bdc10, kw_splat=0, passed_block_handler=0) at ../vm.c:1775
#41 0x000055555589c757 in rb_vm_invoke_proc (ec=0x555555ba7540, proc=0x555555bb1790, argc=0,
argv=0x7fffdc1bdc10, kw_splat=0, passed_block_handler=0) at ../vm.c:1796
#42 0x000055555582957e in thread_do_start_proc (th=0x555555baf7e0) at ../thread.c:604
#43 0x00005555558295f3 in thread_do_start (th=0x555555baf7e0) at ../thread.c:621
#44 0x00005555558298f6 in thread_start_func_2 (th=0x555555baf7e0, stack_start=0x7fffdc1bddd8)
at ../thread.c:676
#45 0x0000555555822a39 in call_thread_start_func_2 (th=0x555555baf7e0) at ../thread_pthread.c:2237
#46 0x0000555555822b5d in nt_start (ptr=0x555555c43150) at ../thread_pthread.c:2282
#47 0x00007ffff7a94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#48 0x00007ffff7b26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Updated by luke-gru (Luke Gruber) 27 days ago
The fiber is getting freed in the forked process, but it is not the same physical address as the fiber in the parent process. You are seeing virtual addresses here, these processes don't share memory.
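To make that concrete, here is a minimal standalone sketch (not Ruby-specific; the buffer and messages are made up for illustration) showing that after fork(2) the child works on its own copy-on-write view of the address space: freeing memory in the child does not invalidate the parent's copy, even though both processes print the same virtual address.

/* minimal demo: the child freeing memory does not affect the parent's copy,
 * even though both processes see the same virtual address */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
    char *buf = malloc(32);
    strcpy(buf, "still valid in the parent");

    pid_t pid = fork();
    if (pid == 0) {
        /* child: same virtual address, but private copies of the pages */
        printf("child:  buf=%p, freeing it\n", (void *)buf);
        free(buf);
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    /* parent: its copy was never freed, so this read is still valid */
    printf("parent: buf=%p, contents=\"%s\"\n", (void *)buf, buf);
    free(buf);
    return 0;
}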
Updated by tuonigou (tianyang sun) 27 days ago
luke-gru (Luke Gruber) wrote in #note-8:
The fiber is getting freed in the forked process, but it is not the same physical address as the fiber in the parent process. You are seeing virtual addresses here, these processes don't share memory.
Yeah, that is true. My OS course feels so distant now.
Updated by luke-gru (Luke Gruber) 27 days ago
- Status changed from Open to Closed