Project

General

Profile

Actions

Bug #20286

closed

TracePoint does not emit `thread_end` event when thread exits with exception

Added by ioquatix (Samuel Williams) 9 months ago. Updated 5 months ago.


Description

Using TracePoint to trace thread_begin and thread_end events fails to emit the thread_end event when an exception (e.g., Interrupt) is raised within a thread. This behavior occurs because the exception handling bypasses the internal thread finishing logic, including trace point and fiber scheduler cleanup code. This issue affects the ability to accurately monitor thread lifecycle events in scenarios involving exception handling or abrupt thread terminations.

Steps to Reproduce:

  1. Set up TracePoint to trace thread_begin and thread_end events.
  2. Create a new thread that raises an exception.
  3. Join the thread and observe that only the thread_begin event is emitted without a corresponding thread_end event.

Example Code

TracePoint.trace(:thread_begin, :thread_end) do |tp|
  p [tp.event, tp.lineno, tp.path, tp.defined_class, tp.method_id]
end

thread = Thread.new do
  raise Interrupt
end

thread.join

Current Behavior:

The TracePoint emits the thread_begin event but fails to emit the thread_end event when an exception is raised within the thread, indicating an incomplete tracing of thread lifecycle events.

I've confirmed this as far back as Ruby 2.6.

> ruby ./test.rb
[:thread_begin, 0, nil, nil, nil]
#<Thread:0x000000010384b5a8 ./test.rb:5 run> terminated with exception (report_on_exception is true):
./test.rb:6:in `block in <main>': Interrupt (Interrupt)
./test.rb:6:in `block in <main>': Interrupt (Interrupt)

Expected Behavior:

The TracePoint should emit both thread_begin and thread_end events, accurately reflecting the lifecycle of the thread, even when an exception is raised within the thread.

> ruby ./test.rb
[:thread_begin, 0, nil, nil, nil]
[:thread_end, 0, nil, nil, nil]
#<Thread:0x0000000105435db8 ./test.rb:5 run> terminated with exception (report_on_exception is true):
./test.rb:6:in 'block in <main>': Interrupt (Interrupt)
./test.rb:6:in 'block in <main>': Interrupt (Interrupt)

Possible Fix

Changing the implementation of thread_do_start to have what amounts to an "ensure" block.

static void
thread_do_start(rb_thread_t *th)
{
    native_set_thread_name(th);
    VALUE result = Qundef;

    rb_execution_context_t *ec = th->ec;
    int state;

    EXEC_EVENT_HOOK(ec, RUBY_EVENT_THREAD_BEGIN, th->self, 0, 0, 0, Qundef);

    EC_PUSH_TAG(ec);
    if ((state = EC_EXEC_TAG()) == TAG_NONE) {
        switch (th->invoke_type) {
        case thread_invoke_type_proc:
            result = thread_do_start_proc(th);
            break;

        case thread_invoke_type_ractor_proc:
            result = thread_do_start_proc(th);
            rb_ractor_atexit(th->ec, result);
            break;

        case thread_invoke_type_func:
            result = (*th->invoke_arg.func.func)(th->invoke_arg.func.arg);
            break;

        case thread_invoke_type_none:
            rb_bug("unreachable");
        }
    }

    EC_POP_TAG();
    VALUE errinfo = ec->errinfo;

    if (!NIL_P(errinfo) && !RB_TYPE_P(errinfo, T_OBJECT)) {
        ec->errinfo = Qnil;
    }

    rb_fiber_scheduler_set(Qnil);
    EXEC_EVENT_HOOK(th->ec, RUBY_EVENT_THREAD_END, th->self, 0, 0, 0, Qundef);

    ec->errinfo = errinfo;

    if (state)
        EC_JUMP_TAG(ec, state);

    th->value = result;
}

It's possible rb_fiber_scheduler_set(Qnil); can emit an exception itself. How do we write the code to handle that case?

Actions

Also available in: Atom PDF

Like1
Like0Like0Like0Like0Like0Like0Like0