Bug #21818
Thread backtraces cannot be communicated over Ractor ports
Status: Open
Description
It looks like Thread::Backtrace is confined to the Ractor that produced it: it cannot be copied or moved to another Ractor, nor can it be made shareable. This makes it difficult for a Ractor to communicate exceptions (e.g. the reason for its own failure) to other Ractors. Is this intentional? I would have expected that a Thread::Backtrace is just static data that should not have problems moving between Ractors.
Details below:
Thread::Backtrace cannot be moved to another Ractor
Code:
def make_trace
  caller_locations
end

r1 = Ractor.new do
  port = receive
  trace = make_trace
  puts "**** Original: #{trace.inspect}"
  port.send(trace, move: true) # Fails here
end

port = Ractor::Port.new
r1.send(port)
trace = port.receive # Hangs here
puts "**** Received: #{trace.inspect}"
Result:
**** Original: ["hello.rb:7:in 'block in <main>'"]
#<Thread:0x000000010438f770 run> terminated with exception (report_on_exception is true):
hello.rb:9:in 'Ractor::Port#send': can not move Thread::Backtrace::Location object. (Ractor::Error)
from hello.rb:9:in 'block in <main>'
As a result, it is also not possible to move an Exception:
r1 = Ractor.new do
  port = receive
  begin
    raise "hello"
  rescue => e
    port.send(e, move: true) # Fails here
  end
end

port = Ractor::Port.new
r1.send(port)
e = port.receive # Hangs here
Result:
#<Thread:0x0000000102a0f570 run> terminated with exception (report_on_exception is true):
hello.rb:6:in 'Ractor::Port#send': can not move Thread::Backtrace object. (Ractor::Error)
from hello.rb:6:in 'block in <main>'
Note that Kernel#caller returns a string array that can be moved successfully:
def make_trace
  caller
end

r1 = Ractor.new do
  port = receive
  trace = make_trace
  puts "**** Original: #{trace.inspect}"
  port.send(trace, move: true)
end

port = Ractor::Port.new
r1.send(port)
trace = port.receive
puts "**** Received: #{trace.inspect}"
Result:
**** Original: ["hello.rb:7:in 'block in <main>'"]
**** Received: ["hello.rb:7:in 'block in <main>'"]
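Since plain strings and integers cross Ractor boundaries without trouble, one possible workaround is to flatten each Thread::Backtrace::Location into plain data before sending it. The following is only a sketch: the helper name `portable_locations` and the hash keys are hypothetical, not part of any Ruby API. Only `Location#path`, `Location#lineno`, `Location#label` and `Ractor.make_shareable` are existing methods.

```ruby
# Hypothetical helper: flatten Thread::Backtrace::Location objects into
# plain hashes of strings and integers, which Ractors can copy or move.
def portable_locations(locations)
  locations.map do |loc|
    {
      path: loc.path,     # file name of the frame
      lineno: loc.lineno, # line number of the frame
      label: loc.label    # method or block label of the frame
    }
  end
end

def make_trace
  portable_locations(caller_locations)
end

# Plain data can be deep-frozen and shared, unlike the Location objects.
trace = Ractor.make_shareable(make_trace)
puts trace.inspect
```

This keeps the structured fields that Kernel#caller collapses into a single string, at the cost of losing the other Location methods on the receiving side.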
Thread::Backtrace cannot be copied to another Ractor either
Code:
def make_trace
  caller_locations
end

r1 = Ractor.new do
  port = receive
  trace = make_trace
  puts "**** Original: #{trace.inspect}"
  port.send(trace, move: false) # Fails here
end

port = Ractor::Port.new
r1.send(port)
trace = port.receive # Hangs here
puts "**** Received: #{trace.inspect}"
Result. Note the specific error message "allocator undefined" that might give a clue why this is all happening:
**** Original: ["hello.rb:7:in 'block in <main>'"]
#<Thread:0x000000010031f5c8 run> terminated with exception (report_on_exception is true):
hello.rb:9:in 'Ractor::Port#send': allocator undefined for Thread::Backtrace::Location (TypeError)
port.send(trace, move: false)
^^^^^^^^^^^^^^^^^^
from hello.rb:9:in 'block in <main>'
Again, Kernel#caller returns a string array that copies successfully:
def make_trace
  caller
end

r1 = Ractor.new do
  port = receive
  trace = make_trace
  puts "**** Original: #{trace.inspect}"
  port.send(trace, move: false)
end

port = Ractor::Port.new
r1.send(port)
trace = port.receive
puts "**** Received: #{trace.inspect}"
Result:
**** Original: ["hello.rb:7:in 'block in <main>'"]
**** Received: ["hello.rb:7:in 'block in <main>'"]
Interestingly, if an exception is copied when sent over a port, it doesn't fail with an undefined allocator, but the backtrace in the copy is empty:
r1 = Ractor.new do
  port = receive
  begin
    raise "hello"
  rescue => e
    puts "**** Original: #{e.backtrace.inspect}"
    port.send(e, move: false)
  end
end

port = Ractor::Port.new
r1.send(port)
e = port.receive
puts "**** Received: #{e.backtrace.inspect}"
Result:
**** Original: ["hello.rb:4:in 'block in <main>'"]
**** Received: []
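Because the copied exception arrives with an empty backtrace, one possible workaround is to send the string backtrace as plain data and reattach it on the receiving side. This is a sketch under stated assumptions: `dump_exception` and `load_exception` are hypothetical helpers, and the exception class is assumed to be a named class whose constructor accepts just a message. `Exception#backtrace`, `Exception#set_backtrace` and `Object.const_get` are existing Ruby methods.

```ruby
# Hypothetical helper: reduce an exception to plain, copyable data.
# Strings are duplicated so that freezing the payload later does not
# freeze objects still owned by the original exception.
def dump_exception(e)
  {
    class_name: e.class.name,
    message: e.message.dup,
    backtrace: (e.backtrace || []).map(&:dup) # strings, like Kernel#caller
  }
end

# Hypothetical helper: rebuild an equivalent exception from that data.
# Assumes the class is named and can be constructed from a message alone.
def load_exception(data)
  klass = Object.const_get(data[:class_name])
  e = klass.new(data[:message])
  e.set_backtrace(data[:backtrace])
  e
end

begin
  raise "hello"
rescue => e
  payload = Ractor.make_shareable(dump_exception(e))
end

restored = load_exception(payload)
puts "**** Restored: #{restored.class}: #{restored.message}"
puts restored.backtrace.inspect
```

In a Ractor setting, the sending side would pass `payload` over the port instead of the exception itself, and the receiver would call `load_exception` on what it receives.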
Thread::Backtrace cannot be made shareable
Code:
def make_trace
  caller_locations
end

trace = Ractor.make_shareable(make_trace) # Fails here
Result:
hello.rb:5:in 'Ractor.make_shareable': can not make shareable object for "hello.rb:5:in '<main>'" (Ractor::Error)
from hello.rb:5:in '<main>'
As a result, exceptions also cannot be made shareable:
begin
  raise "hello"
rescue => e
  e = Ractor.make_shareable(e) # Fails here
end
Result:
hello.rb: can not make shareable object for #<Thread::Backtrace:0x0000000102fdfe28> (Ractor::Error)
Interestingly, if the backtrace is empty, make_shareable succeeds, suggesting that it's specifically the Thread::Backtrace::Location that cannot be made shareable:
trace = Ractor.make_shareable(caller_locations)
puts trace.inspect
Result:
[]
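To check that hypothesis directly, each frame can be probed with `Ractor.shareable?`. A small sketch follows; the helper name `shareability_report` is hypothetical. `caller_locations(0)` starts at the current frame, so the array is non-empty even at the top level, where a bare `caller_locations` returns `[]`. The result depends on the Ruby version in use, so no output is shown.

```ruby
# Hypothetical helper: pair each frame's class with whether Ractor
# considers that object shareable.
def shareability_report(locations)
  locations.map { |loc| [loc.class, Ractor.shareable?(loc)] }
end

# Start at frame 0 so that at least the current frame is included.
shareability_report(caller_locations(0)).each do |klass, shareable|
  puts "#{klass}: shareable? #{shareable}"
end

# For comparison: a frozen empty array has no elements to reject.
puts "empty array shareable? #{Ractor.shareable?([].freeze)}"
```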
And, of course, the result of Kernel#caller can be made shareable:
def make_trace
  caller
end

trace = Ractor.make_shareable(make_trace)
puts trace.inspect
Result:
["hello.rb:5:in '<main>'"]
Updated by byroot (Jean Boussier) 4 days ago
Is this intentional?
I don't think so, more of an oversight. Looking at the implementation, I don't see a reason why it couldn't be marked as FROZEN_SHAREABLE.
Updated by byroot (Jean Boussier) 4 days ago
https://github.com/ruby/ruby/pull/15790 fixes the make_shareable. As for sending, I'm not sure if there is a way to allow an object to be copied without defining an allocator function.
Updated by byroot (Jean Boussier) 4 days ago
https://github.com/ruby/ruby/pull/15790 fixes the make_shareable.
Actually, it might be more tricky than that. Backtrace objects have two lazily instantiated mutable arrays (backtrace and backtrace_locations). So if we mark them as FROZEN_SHAREABLE, we probably need to instantiate these arrays as frozen too?
I wonder what @ko1 (Koichi Sasada), @tenderlovemaking (Aaron Patterson), @jhawthorn (John Hawthorn), and @luke-gru (Luke Gruber) think?
Updated by byroot (Jean Boussier) 4 days ago
I'm not sure if there is a way to allow an object to be copied without defining an allocator function.
So I experimented with this a bit. It wouldn't be hard at all to disable Class#allocate while still allowing rb_obj_dup etc. to work.
However, for Backtrace there is an extra challenge: Backtrace objects rely on variable width allocation, and the issue with rb_obj_dup is that it just blindly calls rb_obj_alloc(rb_obj_class(obj)). There is no consideration of how large the copy would need to be, etc.
So I'm afraid there's quite a bit of research to do around here to better support embedded TypedData.
Updated by byroot (Jean Boussier) 4 days ago
I have another PR that allows Backtrace objects to be duped, and hence to be sent across Ractors: https://github.com/ruby/ruby/pull/15795