Bug #21166
openFiber Scheduler is unable to be interrupted by `IO#close`.
Description
Background¶
Ruby's IO#close
can cause IO#read
, IO#write
, IO#wait
, IO#wait_readable
and IO#wait_writable
to be interrupted with an IOError: stream closed in another thread
. For reference, IO#select
cannot be interrupted in this way.
r, w = IO.pipe
thread = Thread.new do
r.read(1)
end
Thread.pass until thread.status == "sleep"
r.close
thread.join
# ./test.rb:6:in 'IO#read': stream closed in another thread (IOError)
Problem¶
The fiber scheduler provides hooks for io_read
, io_write
and io_wait
which are used by IO#read
, IO#write
, IO#wait
, IO#wait_readable
and IO#wait_writable
, but those hooks are not interrupted when IO#close
is invoked. That is because rb_notify_fd_close
is not scheduler aware, and the fiber scheduler is unable to register itself into the "waiting file descriptor" list.
#!/usr/bin/env ruby
require 'async'
r, w = IO.pipe
thread = Thread.new do
Async do
r.wait_readable
end
end
Thread.pass until thread.status == "sleep"
r.close
thread.join
In this test program, rb_notify_fd_close
will incorrectly terminate the entire fiber scheduler thread:
#<Thread:0x00007faa5b161bf8 /home/samuel/Developer/socketry/io-event/test.rb:7 run> terminated with exception (report_on_exception is true):
/home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:470:in 'IO.select': closed stream (IOError)
from /home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:470:in 'block in IO::Event::Selector::Select#select'
from /home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:468:in 'Thread.handle_interrupt'
from /home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:468:in 'IO::Event::Selector::Select#select'
from /home/samuel/.gem/ruby/3.4.1/gems/async-2.23.0/lib/async/scheduler.rb:396:in 'Async::Scheduler#run_once!'
...
Solution¶
We need a mechanism to ensure fibers are treated the same as threads, and interrupted correctly. We do this by:
- Introducing
VALUE rb_thread_io_interruptible_operation(VALUE self, VALUE(*function)(VALUE), VALUE argument)
which allows us to execute a callback that may be interrupted. Internally, this registers the current execution context into the existingwaiting_fds
list. - We update all the relevant fiber scheduler hooks to use
rb_thread_io_interruptible_operation
, e.g.io_wait
,io_read
,io_write
and so on. - We introduce
VALUE rb_fiber_scheduler_fiber_interrupt(VALUE scheduler, VALUE fiber, VALUE exception)
which can be used to interrupt a fiber, e.g. with an IOError exception. -
rb_notify_fd_close
is modified to correctly interrupt fibers using the new rb_fiber_scheduler_fiber_interrupt` function.
See https://github.com/ruby/ruby/pull/12839 for the proposed implementation.
Updated by luke-gru (Luke Gruber) 9 days ago
I'm not knowledgeable when it comes to the fiber scheduler, but why add this new method io.interruptible_operation
when you can accomplish the same thing by allowing the io operation to be interrupted by a close in every case like the regular thread scheduler does? Are there cases where you don't want the IO operation to be interrupted be a close?
Updated by ioquatix (Samuel Williams) 9 days ago
when you can accomplish the same thing by allowing the io operation to be interrupted by a close in every case like the regular thread scheduler does
Because the point at which IO operations become interruptible is not the entire operation, but usually the point at where the fiber yields back to the event loop, e.g. https://github.com/socketry/io-event/pull/130/files#diff-3e7a68b220a9360ead1d2e7a5ec23e7d36de591eab138721efdc61b565fc5194R552 - adding interrupts around the entire operation is unlikely to be safe nor desirable.
Maybe you can propose in more detail how your suggestion would work. I suppose you are suggesting to wrap the scheduler code in rb_fiber_scheduler_io_wait
and so on?
Updated by ioquatix (Samuel Williams) 9 days ago
· Edited
The fiber scheduler hook rb_fiber_scheduler_io_wait
could be implemented like this:
static VALUE
fiber_scheduler_io_wait(VALUE _argument) {
VALUE *arguments = (VALUE*)_argument;
return rb_funcallv(arguments[0], id_io_wait, 3, arguments + 1);
}
VALUE
rb_fiber_scheduler_io_wait(VALUE scheduler, VALUE io, VALUE events, VALUE timeout)
{
VALUE arguments[] = {
scheduler, io, events, timeout
};
return rb_io_interruptible_operation(io, fiber_scheduler_io_wait, (VALUE)&arguments);
}
I'll consider how to modify io_read
and io_write
hooks.
Updated by luke-gru (Luke Gruber) 8 days ago
Yeah, this makes more sense to me. The other way introduces a method on all IO
objects that's only useful in the fiber scheduler context, and I could see users forgetting
to call it. I think having it be the default is the more sensible approach, and it avoids an API.
Updated by ioquatix (Samuel Williams) 2 days ago
- Description updated (diff)
- Assignee set to ioquatix (Samuel Williams)
Updated by ioquatix (Samuel Williams) 2 days ago
I have updated the proposal based on the discussion.
Updated by ko1 (Koichi Sasada) 1 day ago
· Edited
could you make clear
-
rb_thread_io_interruptible_operation
why notrb_io
prefix? is it public c-api? - who/when/how unregister the hooks?
- could you clear how fiber scheduler use it? no
io_close
event (callback)? - which context the callback function is called?
Updated by ioquatix (Samuel Williams) 1 day ago
rb_thread_io_interruptible_operation why not rb_io prefix? is it public c-api?
It's not public API, and we can change the name. It is defined in thread.c
and thread.h
since that is where waiting_fd
is defined/used. waiting_fd
is not exposed outside of thread.c
.
who/when/how unregister the hooks?
rb_thread_io_interruptible_operation
uses a callback, so before entry to the callback, the operation is registered in waiting_fds
, and on exit from the callback, it is removed.
- Entry: https://github.com/ruby/ruby/pull/12839/files#diff-161b2a279f4c67a1ab075a7890ecf6f3f1d483d910910fa52b3715e25cfdcbd7R1753
- Exit: https://github.com/ruby/ruby/pull/12839/files#diff-161b2a279f4c67a1ab075a7890ecf6f3f1d483d910910fa52b3715e25cfdcbd7R1740
could you clear how fiber scheduler use it? no io_close event (callback)?
The fiber scheduler implementation does not need to be changed, since we added this to the scheduler.c
implementation.
which context the callback function is called?
It's used to wrap the execution of io_read
/io_write
and io_wait
scheduler hooks: https://github.com/ruby/ruby/pull/12839/files#diff-29a83910395dea702ac0c02b4cd9976ba252ff50f133d7cd8109beffbd2eab1d