Project

General

Profile

Actions

Bug #21166

open

Fiber Scheduler is unable to be interrupted by `IO#close`.

Added by ioquatix (Samuel Williams) 10 days ago. Updated 1 day ago.


Description

Background

Ruby's IO#close can cause IO#read, IO#write, IO#wait, IO#wait_readable and IO#wait_writable to be interrupted with an IOError: stream closed in another thread. For reference, IO#select cannot be interrupted in this way.

r, w = IO.pipe

thread = Thread.new do
  r.read(1)
end

Thread.pass until thread.status == "sleep"

r.close

thread.join
# ./test.rb:6:in 'IO#read': stream closed in another thread (IOError)

Problem

The fiber scheduler provides hooks for io_read, io_write and io_wait which are used by IO#read, IO#write, IO#wait, IO#wait_readable and IO#wait_writable, but those hooks are not interrupted when IO#close is invoked. That is because rb_notify_fd_close is not scheduler aware, and the fiber scheduler is unable to register itself into the "waiting file descriptor" list.

#!/usr/bin/env ruby

require 'async'

r, w = IO.pipe

thread = Thread.new do
  Async do
    r.wait_readable
  end
end

Thread.pass until thread.status == "sleep"

r.close

thread.join

In this test program, rb_notify_fd_close will incorrectly terminate the entire fiber scheduler thread:

#<Thread:0x00007faa5b161bf8 /home/samuel/Developer/socketry/io-event/test.rb:7 run> terminated with exception (report_on_exception is true):
/home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:470:in 'IO.select': closed stream (IOError)
  from /home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:470:in 'block in IO::Event::Selector::Select#select'
  from /home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:468:in 'Thread.handle_interrupt'
  from /home/samuel/Developer/socketry/io-event/lib/io/event/selector/select.rb:468:in 'IO::Event::Selector::Select#select'
  from /home/samuel/.gem/ruby/3.4.1/gems/async-2.23.0/lib/async/scheduler.rb:396:in 'Async::Scheduler#run_once!'
...

Solution

We need a mechanism to ensure fibers are treated the same as threads, and interrupted correctly. We do this by:

  1. Introducing VALUE rb_thread_io_interruptible_operation(VALUE self, VALUE(*function)(VALUE), VALUE argument) which allows us to execute a callback that may be interrupted. Internally, this registers the current execution context into the existing waiting_fds list.
  2. We update all the relevant fiber scheduler hooks to use rb_thread_io_interruptible_operation, e.g. io_wait, io_read, io_write and so on.
  3. We introduce VALUE rb_fiber_scheduler_fiber_interrupt(VALUE scheduler, VALUE fiber, VALUE exception) which can be used to interrupt a fiber, e.g. with an IOError exception.
  4. rb_notify_fd_close is modified to correctly interrupt fibers using the new rb_fiber_scheduler_fiber_interrupt` function.

See https://github.com/ruby/ruby/pull/12839 for the proposed implementation.

Actions #1

Updated by ioquatix (Samuel Williams) 10 days ago

  • Description updated (diff)

Updated by luke-gru (Luke Gruber) 10 days ago

I'm not knowledgeable when it comes to the fiber scheduler, but why add this new method io.interruptible_operation when you can accomplish the same thing by allowing the io operation to be interrupted by a close in every case like the regular thread scheduler does? Are there cases where you don't want the IO operation to be interrupted be a close?

Updated by ioquatix (Samuel Williams) 10 days ago

when you can accomplish the same thing by allowing the io operation to be interrupted by a close in every case like the regular thread scheduler does

Because the point at which IO operations become interruptible is not the entire operation, but usually the point at where the fiber yields back to the event loop, e.g. https://github.com/socketry/io-event/pull/130/files#diff-3e7a68b220a9360ead1d2e7a5ec23e7d36de591eab138721efdc61b565fc5194R552 - adding interrupts around the entire operation is unlikely to be safe nor desirable.

Maybe you can propose in more detail how your suggestion would work. I suppose you are suggesting to wrap the scheduler code in rb_fiber_scheduler_io_wait and so on?

Updated by ioquatix (Samuel Williams) 9 days ago · Edited

The fiber scheduler hook rb_fiber_scheduler_io_wait could be implemented like this:

static VALUE
fiber_scheduler_io_wait(VALUE _argument) {
    VALUE *arguments = (VALUE*)_argument;

    return rb_funcallv(arguments[0], id_io_wait, 3, arguments + 1);
}

VALUE
rb_fiber_scheduler_io_wait(VALUE scheduler, VALUE io, VALUE events, VALUE timeout)
{
    VALUE arguments[] = {
        scheduler, io, events, timeout
    };

    return rb_io_interruptible_operation(io, fiber_scheduler_io_wait, (VALUE)&arguments);
}

I'll consider how to modify io_read and io_write hooks.

Updated by luke-gru (Luke Gruber) 9 days ago

Yeah, this makes more sense to me. The other way introduces a method on all IO objects that's only useful in the fiber scheduler context, and I could see users forgetting
to call it. I think having it be the default is the more sensible approach, and it avoids an API.

Updated by ioquatix (Samuel Williams) 2 days ago

  • Description updated (diff)
  • Assignee set to ioquatix (Samuel Williams)
Actions #7

Updated by ioquatix (Samuel Williams) 2 days ago

  • Description updated (diff)

Updated by ioquatix (Samuel Williams) 2 days ago

I have updated the proposal based on the discussion.

Updated by ko1 (Koichi Sasada) 1 day ago · Edited

could you make clear

  • rb_thread_io_interruptible_operation why not rb_io prefix? is it public c-api?
  • who/when/how unregister the hooks?
  • could you clear how fiber scheduler use it? no io_close event (callback)?
  • which context the callback function is called?

Updated by ioquatix (Samuel Williams) 1 day ago

rb_thread_io_interruptible_operation why not rb_io prefix? is it public c-api?

It's not public API, and we can change the name. It is defined in thread.c and thread.h since that is where waiting_fd is defined/used. waiting_fd is not exposed outside of thread.c.

who/when/how unregister the hooks?

rb_thread_io_interruptible_operation uses a callback, so before entry to the callback, the operation is registered in waiting_fds, and on exit from the callback, it is removed.

could you clear how fiber scheduler use it? no io_close event (callback)?

The fiber scheduler implementation does not need to be changed, since we added this to the scheduler.c implementation.

which context the callback function is called?

It's used to wrap the execution of io_read/io_write and io_wait scheduler hooks: https://github.com/ruby/ruby/pull/12839/files#diff-29a83910395dea702ac0c02b4cd9976ba252ff50f133d7cd8109beffbd2eab1d

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0