Bug #13882
closedException in `ensure` stops threads from exiting
Description
When the Ruby process is gracefully exiting, if a thread has an exception during an ensure
block it appears the Ruby process forgets it's attempting to exit and will keep running forever. Since there's still an alive thread, rb_thread_terminate_all
doesn't finish (since vm_living_thread_num(vm) > 1
is still true), and the Ruby process never exits until you kill -9
it.
I was able to cause this going back as far as MRI 2.0.0, but didn't have a 1.9.3 install to double check with. Repo case:
Thread.new do
loop do
puts "Loop start"
begin
begin
sleep
ensure
raise
end
rescue => e
p e
end
end
end
sleep 1
exit
Will result in a two Loop start
messages.
When running GDB on the process, we see Ruby is waiting on sleep_forever
which is expected, but you can see the main thread is stuck on rb_thread_terminate_all
:
(gdb) t a a bt
Thread 3 (Thread 0x7f956f36e700 (LWP 401088)):
#0 0x00000033d90df113 in poll () from /lib64/libc.so.6
#1 0x00007f9575863775 in timer_thread_sleep (p=0x7f957524c008) at thread_pthread.c:1460
#2 thread_timer (p=0x7f957524c008) at thread_pthread.c:1568
#3 0x00000033d9407aa1 in start_thread () from /lib64/libpthread.so.0
#4 0x00000033d90e893d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f956f363700 (LWP 401117)):
#0 0x00000033d940b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f957585d959 in native_cond_wait (cond=<value optimized out>, mutex=<value optimized out>) at thread_pthread.c:343
#2 0x00007f95758675b8 in native_sleep (th=0x7f956c491800, timeout_tv=0x0) at thread_pthread.c:1147
#3 0x00007f95758682f2 in sleep_forever () at thread.c:1083
#4 rb_thread_sleep_forever () at thread.c:1157
#5 0x00007f95757e1ca5 in rb_f_sleep (argc=0, argv=0x7f956bce6a78) at process.c:4393
#6 0x00007f9575890bda in vm_call_cfunc_with_frame (th=0x7f956c491800, reg_cfp=0x7f956bde6940, calling=<value optimized out>, ci=0x7f956e98c910, cc=<value optimized out>)
at vm_insnhelper.c:1752
#7 vm_call_cfunc (th=0x7f956c491800, reg_cfp=0x7f956bde6940, calling=<value optimized out>, ci=0x7f956e98c910, cc=<value optimized out>) at vm_insnhelper.c:1847
#8 0x00007f957589f12b in vm_exec_core (th=<value optimized out>, initial=<value optimized out>) at insns.def:1066
#9 0x00007f95758a470b in vm_exec (th=0x7f956c491800) at vm.c:1727
#10 0x00007f95758abc04 in invoke_block () at vm.c:969
#11 invoke_iseq_block_from_c () at vm.c:1014
#12 invoke_block_from_c_splattable () at vm.c:1032
#13 vm_yield () at vm.c:1074
#14 rb_yield_0 () at vm_eval.c:1010
#15 loop_i () at vm_eval.c:1088
#16 0x00007f9575755954 in rb_rescue2 (b_proc=0x7f95758ab8b0 <loop_i>, data1=0, r_proc=0x7f957588b2a0 <loop_stop>, data2=0) at eval.c:838
#17 0x00007f9575890bda in vm_call_cfunc_with_frame (th=0x7f956c491800, reg_cfp=0x7f956bde69a0, calling=<value optimized out>, ci=0x7f956c5090f0, cc=<value optimized out>)
at vm_insnhelper.c:1752
#18 vm_call_cfunc (th=0x7f956c491800, reg_cfp=0x7f956bde69a0, calling=<value optimized out>, ci=0x7f956c5090f0, cc=<value optimized out>) at vm_insnhelper.c:1847
#19 0x00007f95758a726b in vm_call_method (th=0x7f956c491800, cfp=0x7f956bde69a0, calling=<value optimized out>, ci=<value optimized out>, cc=<value optimized out>)
at vm_insnhelper.c:2295
#20 0x00007f957589f97c in vm_exec_core (th=<value optimized out>, initial=<value optimized out>) at insns.def:967
#21 0x00007f95758a470b in vm_exec (th=0x7f956c491800) at vm.c:1727
#22 0x00007f95758a5771 in invoke_block (th=0x7f956c491800, captured=<value optimized out>, self=140279789266160, argc=<value optimized out>, argv=<value optimized out>,
passed_block_handler=<value optimized out>, cref=0x0, splattable=0, is_lambda=0) at vm.c:969
#23 invoke_iseq_block_from_c (th=0x7f956c491800, captured=<value optimized out>, self=140279789266160, argc=<value optimized out>, argv=<value optimized out>,
passed_block_handler=<value optimized out>, cref=0x0, splattable=0, is_lambda=0) at vm.c:1014
#24 0x00007f95758a581f in invoke_block_from_c_unsplattable (th=<value optimized out>, block=<value optimized out>, self=<value optimized out>, argc=<value optimized out>,
argv=<value optimized out>, passed_block_handler=<value optimized out>, is_lambda=<value optimized out>) at vm.c:1101
#25 0x00007f95758a595a in vm_invoke_proc (th=0x7f956c491800, proc=0x7f956e04da50, self=140279789266160, argc=0, argv=0x7f956c547c18, passed_block_handler=0) at vm.c:1126
#26 0x00007f9575864589 in thread_do_start (th=0x7f956c491800, stack_start=0x7f956f364000) at thread.c:577
#27 thread_start_func_2 (th=0x7f956c491800, stack_start=0x7f956f364000) at thread.c:619
#28 0x00007f95758649b6 in thread_start_func_1 (th_ptr=0x7f956c491800) at thread_pthread.c:887
#29 0x00000033d9407aa1 in start_thread () from /lib64/libpthread.so.0
#30 0x00000033d90e893d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f95756ab760 (LWP 401080)):
#0 0x00000033d940b68c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f957585d959 in native_cond_wait (cond=<value optimized out>, mutex=<value optimized out>) at thread_pthread.c:343
#2 0x00007f95758675b8 in native_sleep (th=0x7f9575229400, timeout_tv=0x0) at thread_pthread.c:1147
#3 0x00007f9575867e34 in rb_thread_terminate_all () at thread.c:494
#4 0x00007f9575758466 in ruby_cleanup (ex=6) at eval.c:186
#5 0x00007f9575758725 in ruby_run_node (n=0x7f956c54c6b8) at eval.c:300
#6 0x00000000004008eb in main (argc=2, argv=0x7fffccd8e7c8) at main.c:36
(gdb)
Looking at the thread state shows that the rb_threadptr_to_kill
executed properly, because to_kill
is properly set, but the errinfo
was nilled out:
(gdb) p ruby_current_thread
$1 = (rb_thread_t *) 0x7f8cd06eb800
(gdb) p ruby_current_thread->to_kill
$2 = 1
(gdb) p ruby_current_thread->status
$3 = THREAD_STOPPED
(gdb) p ruby_current_thread->errinfo
$4 = 8
I'm happy to contribute a patch, but not quite sure what the appropriate fix for this would be. It looks like the issue is an exception in ensure
is causing the TAG_FATAL
on errinfo
to be overwritten so the thread doesn't think it should exit, and the right fix is to also check if to_kill
is set.
Files