Bug #21127
Updated by pocke (Masataka Kuwabara) 12 days ago
Previously, Ruby displayed backtraces for each thread on deadlock. However, it has not been shown since Ruby 3.0. It should display the backtrace for debugging. (related: https://bugs.ruby-lang.org/issues/8214) ## Step to reproduce ```ruby def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join ``` Ruby 2.7.8: It displays the trace. ``` ruby-2.7.8 ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x0000004000653e80 main thread:0x000000400035da50 * #<Thread:0x000000400038ad70 sleep_forever> rb_thread_t:0x000000400035da50 native:0x0000004002fb7740 int:0 -e:9:in `join' -e:9:in `<main>' * #<Thread:0x00000040004fe760 -e:6 sleep_forever> rb_thread_t:0x00000040004518a0 native:0x00000040067ae700 int:0 depended by: tb_thread_id:0x000000400035da50 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:6:in `block in <main>' * #<Thread:0x00000040004fe620 -e:7 sleep_forever> rb_thread_t:0x0000004000653e80 native:0x000000400e9af700 int:0 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:7:in `block in <main>' from -e:9:in `<main>' ``` Ruby 3.0.0: There is no trace. ``` ruby-3.0.0 ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x00000040006e88c0 main thread:0x00000040003ffa10 * #<Thread:0x000000400046f448 sleep_forever> rb_thread_t:0x00000040003ffa10 native:0x0000004003059740 int:0 * #<Thread:0x000000400073c2e8 -e:6 sleep_forever> rb_thread_t:0x00000040006e15a0 native:0x0000004006850700 int:0 depended by: tb_thread_id:0x00000040003ffa10 * #<Thread:0x000000400073c1f8 -e:7 sleep_forever> rb_thread_t:0x00000040006e88c0 native:0x000000400ea51700 int:0 from -e:9:in `<main>' ``` Ruby 3.5.0.dev (latest): There is no trace. ``` ruby 3.5.0dev (2025-02-10T00:27:18Z master 135479a58d) +PRISM [arm64-darwin24] -e:9:in 'Thread#join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x000000012ae3b530 main thread:0x000000012af04080 * #<Thread:0x0000000104aaaf78 sleep_forever> rb_thread_t:0x000000012af04080 native:0x00000001e9d93240 int:0 * #<Thread:0x0000000104e5e048 -e:6 sleep_forever> rb_thread_t:0x000000012ae23a90 native:0x000000016b5f3000 int:0 depended by: tb_thread_id:0x000000012af04080 * #<Thread:0x0000000104e5deb8 -e:7 sleep_forever> rb_thread_t:0x000000012ae3b530 native:0x000000016b6ff000 int:0 from -e:9:in '<main>' ``` Note: The results of 2.7 and 3.0 are indented because I executed them with all-ruby. ## Cause This commit has introduced this change. https://github.com/ruby/ruby/commit/3b24b7914c16930bfadc89d6aff6326a51c54295 The ticket is https://bugs.ruby-lang.org/issues/17031 It changes `rb_ec_backtrace_str_ary` function in `vm_backtrace.c`. Previously, `rb_ec_backtrace_str_ary(ec, 0, 0)` returns all traces, but it returns an empty array with this change. It should pass `ALL_BACKTRACE_LINES` (`-1`) to get all traces. I used rubyfarm also to confirm that this commit was the cause. With the commit, it does not display the traces. ``` docker run --rm -ti rubylang/rubyfarm:3b24b7914c16930bfadc89d6aff6326a51c54295 ruby -ve 'def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join' ruby 2.8.0dev (2020-08-27T22:17:36Z :detached: 3b24b7914c) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x0000004000566a80 main thread:0x0000004000202a60 * #<Thread:0x000000400026faf8 sleep_forever> rb_thread_t:0x0000004000202a60 native:0x000000400440d740 int:0 * #<Thread:0x00000040004f1768 -e:6 sleep_forever> rb_thread_t:0x0000004000566710 native:0x00000040081fe700 int:0 depended by: tb_thread_id:0x0000004000202a60 * #<Thread:0x00000040004f1600 -e:7 sleep_forever> rb_thread_t:0x0000004000566a80 native:0x00000040103ff700 int:0 from -e:9:in `<main>' ``` With the parent commit, it displays the traces. ``` docker run --rm -ti rubylang/rubyfarm:8095114f1715070fcdc2b29303dcf55a7fcc32a3 ruby -ve 'def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join' ruby 2.8.0dev (2020-08-27T19:54:36Z :detached: 8095114f17) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x000000400055b3a0 main thread:0x0000004000202a60 * #<Thread:0x000000400026faf8 sleep_forever> rb_thread_t:0x0000004000202a60 native:0x000000400440d740 int:0 -e:9:in `join' -e:9:in `<main>' * #<Thread:0x00000040004f1768 -e:6 sleep_forever> rb_thread_t:0x000000400055afe0 native:0x00000040081fe700 int:0 depended by: tb_thread_id:0x0000004000202a60 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:6:in `block in <main>' * #<Thread:0x00000040004f1600 -e:7 sleep_forever> rb_thread_t:0x000000400055b3a0 native:0x00000040103ff700 int:0 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:7:in `block in <main>' from -e:9:in `<main>' ``` I think it's not an intentional change. The ticket and NEWS do not mention the backtrace on a deadlock. (And the backtrace is really helpful!) ## Solution The following patch fixes the problem. PR: https://github.com/ruby/ruby/pull/12721 I'll make a PR on GitHub soon. ```diff diff --git a/eval_intern.h b/eval_intern.h index 9a551120ff..f783c93aff 100644 --- a/eval_intern.h +++ b/eval_intern.h @@ -300,6 +300,8 @@ void rb_vm_set_progname(VALUE filename); VALUE rb_vm_cbase(void); /* vm_backtrace.c */ +#define BACKTRACE_START 0 +#define ALL_BACKTRACE_LINES -1 VALUE rb_ec_backtrace_object(const rb_execution_context_t *ec); VALUE rb_ec_backtrace_str_ary(const rb_execution_context_t *ec, long lev, long n); VALUE rb_ec_backtrace_location_ary(const rb_execution_context_t *ec, long lev, long n, bool skip_internal); diff --git a/thread.c b/thread.c index d40d5140b7..5280c667af 100644 --- a/thread.c +++ b/thread.c @@ -5596,7 +5596,7 @@ debug_deadlock_check(rb_ractor_t *r, VALUE msg) } } rb_str_catf(msg, "\n "); - rb_str_concat(msg, rb_ary_join(rb_ec_backtrace_str_ary(th->ec, 0, 0), sep)); + rb_str_concat(msg, rb_ary_join(rb_ec_backtrace_str_ary(th->ec, BACKTRACE_START, ALL_BACKTRACE_LINES), sep)); rb_str_catf(msg, "\n"); } } diff --git a/vm_backtrace.c b/vm_backtrace.c index 56a0d861d6..6f2cf9e169 100644 --- a/vm_backtrace.c +++ b/vm_backtrace.c @@ -31,9 +31,6 @@ id2str(ID id) } #define rb_id2str(id) id2str(id) -#define BACKTRACE_START 0 -#define ALL_BACKTRACE_LINES -1 - inline static int calc_pos(const rb_iseq_t *iseq, const VALUE *pc, int *lineno, int *node_id) { ``` The result is here: ``` $ ruby -ve 'def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join' ruby 3.5.0dev (2025-02-10T00:27:18Z origin/master 135479a58d) +PRISM [arm64-darwin24] -e:9:in 'Thread#join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x0000000157a053a0 main thread:0x0000000156704080 * #<Thread:0x000000011bcfaf78 sleep_forever> rb_thread_t:0x0000000156704080 native:0x00000001e9d93240 int:0 -e:9:in 'Thread#join' -e:9:in '<main>' * #<Thread:0x000000011d2d0bb8 -e:6 sleep_forever> rb_thread_t:0x0000000157a04d40 native:0x000000016dcb7000 int:0 depended by: tb_thread_id:0x0000000156704080 <internal:thread_sync>:18:in 'Thread::Queue#pop' -e:3:in 'Object#h' -e:2:in 'Object#g' -e:1:in 'Object#f' -e:6:in 'block in <main>' * #<Thread:0x000000011d2d07a8 -e:7 sleep_forever> rb_thread_t:0x0000000157a053a0 native:0x000000016ddc3000 int:0 <internal:thread_sync>:18:in 'Thread::Queue#pop' -e:3:in 'Object#h' -e:2:in 'Object#g' -e:1:in 'Object#f' -e:7:in 'block in <main>' from -e:9:in '<main>' ```