Bug #18455: `IO#close` has poor performance and difficult to understand semantics. - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #18455

closed

`IO#close` has poor performance and difficult to understand semantics.

Bug #18455: `IO#close` has poor performance and difficult to understand semantics.

Added by ioquatix (Samuel Williams) over 4 years ago. Updated about 1 year ago.

Status:

Closed

Assignee:

ioquatix (Samuel Williams)

Target version:

ruby -v:

Backport:

2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN

[ruby-core:106939]

Description

IO#close should be responsible for closing the file descriptor referred to by the IO instance. When dealing with buffered IO, one can also expect this to flush the internal buffers if possible.

Currently, all blocking IO operations release the GVL and perform the blocking system call using rb_thread_io_blocking_region. The current implementation takes a file descriptor and adds an entry to the VM global waiting_fds list. When the operation is completed, the entry is removed from waiting_fds.

When calling IO#close, this list is traversed and any threads performing blocking operations with a matching file descriptor are interrupted. The performance of this is O(number of blocking IO operations) which in practice the performance of IO#close can take milliseconds with 10,000 threads performing blocking IO. This performance is unacceptable.

#!/usr/bin/env ruby

require 'benchmark'

class Reading
  def initialize
    @r, @w = IO.pipe

    @thread = Thread.new do
      @r.read
    rescue IOError
      # Ignore.
    end
  end

  attr :r
  attr :w

  attr :thread

  def join
    @thread.join
  end
end

def measure(count = 10)
  readings = count.times.map do
    Reading.new
  end

  sleep 10

  duration = Benchmark.measure do
    readings.each do |reading|
      reading.r.close
      reading.w.close
    end
  end

  average = (duration.total / count) * 1000.0
  pp count: count, average: sprintf("%0.2fms", average)

  readings.each(&:join)
end

measure(   10)
measure(  100)
measure( 1000)
measure(10000)

In addition, the semantics of this operation are confusing at best. While Ruby programs are dealing with IO instances, the VM is dealing with file descriptors, in effect performing some internal de-duplication of IO state. In practice, this leads to strange behaviour:

#!/usr/bin/env ruby

r, w = IO.pipe
r2 = IO.for_fd(r.to_i)
pp r: r, r2: r2

t = Thread.new do
  r2.read rescue nil
  r2.read # EBADF
end

sleep 0.5
r.close
t.join rescue nil

pp r: r, r2: r2
# r is closed, r2 is valid but will raise EBADF on any operation.

In addition, this confusing behaviour extends to Ractor and state is leaked between the two:

r, w = IO.pipe

ractor = Ractor.new(r.to_i) do |fd|
  r2 = IO.for_fd(fd)
  r2.read
  # r2.read # EBADF
end

sleep 0.5
r.close

pp take: ractor.take

I propose the following changes to simplify the semantics and improve performance:

Move the semantics of waiting_fds from per-fd to per-IO. This means that IO#close only interrupts blocking operations performed on the same IO instance rather than ANY IO which refers to the same file descriptor. I think this behaviour is easier to understand and still protects against the vast majority of incorrect usage.
Move the details of struct rb_io_t to internal/io.h so that the implementation details are not part of the public interface.

Benchmarks¶

Before:

{:count=>10, :average=>"0.19ms"}
{:count=>100, :average=>"0.11ms"}
{:count=>1000, :average=>"0.18ms"}
{:count=>10000, :average=>"1.16ms"}

After:

{:count=>10, :average=>"0.20ms"}
{:count=>100, :average=>"0.11ms"}
{:count=>1000, :average=>"0.15ms"}
{:count=>10000, :average=>"0.68ms"}

After investigating this further I found that the rb_thread_io_blocking_region using ubf_select can be incredibly slow, proportional to the number of threads. I don't know whether it's advisable but:

        BLOCKING_REGION(blocking_node.thread, {
            val = func(data1);
            saved_errno = errno;
        }, NULL /* ubf_select */, blocking_node.thread, FALSE);

Disabling the UBF function and relying on read(fd, ...)/write(fd, ...) blocking operations to fail when close(fd) is invoked might be sufficient? This needs more investigation but after making this change, we have constant-time IO#close.

{:count=>10, :average=>"0.13ms"}
{:count=>100, :average=>"0.06ms"}
{:count=>1000, :average=>"0.04ms"}
{:count=>10000, :average=>"0.09ms"}

Which is ideally what we want.

Related issues 2 (2 open — 0 closed)

Updated by ioquatix (Samuel Williams) over 4 years ago Actions
Copy link
#1 [ruby-core:106940]

Here is the PR: https://github.com/ruby/ruby/pull/5384

Updated by Eregon (Benoit Daloze) over 4 years ago Actions
Copy link
#2 [ruby-core:106941]

Nice, I thought this logic was already per IO instance.

I don't see how it will change anything about this leads to strange behaviour though.
Which is fine because it seems unavoidable as long as IO.for_fd(fd) exists.

IO.for_fd(fd) does not dup(2) and so EBADF is expected if the original IO of fd is closed.
IO.for_fd(fd) is unsafe in general, and is only OK when: 1) autoclose = false is called on it; 2) the original IO is alive longer and not closed for all usages of IO.for_fd(fd) for that fd.
For Ractor purposes it's a lot safer to Racter#send(io, move: true) rather than passing the fd and IO.for_fd(fd).

Updated by Eregon (Benoit Daloze) over 4 years ago Actions
Copy link
#3 [ruby-core:106944]

Ah, the difference is thread t won't be interrupted anymore, but close(2) probably stops the blocking read.
@ioquatix (Samuel Williams) So what's the result of the two snippets after this patch?

Updated by ioquatix (Samuel Williams) over 4 years ago Actions
Copy link
#4 [ruby-core:108141]

@shyouhei (Shyouhei Urabe) what was the point of https://github.com/ioquatix/ruby/commit/b121cfde5fbc8cd20684a5fd94047f40323a8353 - Performance? Consistency? Something else?

Even though it's called rb_io_fptr_finalize_internal it now seems part of public Ruby API? Can you confirm that was your desire? Like, basically how is it possible to see this error: https://github.com/ruby/ruby/runs/5770272595?check_suite_focus=true#step:16:93

 /home/runner/work/ruby/ruby/src/tool/lib/test/unit/parallel.rb: TestFileUtils#test_cp_r_socket: symbol lookup error: /home/runner/work/ruby/ruby/build/.ext/x86_64-linux/socket.so: undefined symbol: rb_io_fptr_finalize_internal

Maybe my fault, but I'm seeing:

In file included from ../src/process.c:102:
../src/internal.h:67:9: error: 'rb_io_fptr_finalize' macro redefined [-Werror,-Wmacro-redefined]
#define rb_io_fptr_finalize(...) rb_nonexistent_symbol(__VA_ARGS__)
        ^
../src/internal/io.h:147:9: note: previous definition is here
#define rb_io_fptr_finalize rb_io_fptr_finalize_internal
        ^
1 error generated.

and similar errors.

Updated by shyouhei (Shyouhei Urabe) over 4 years ago Actions
Copy link
#5 [ruby-core:108168]

ioquatix (Samuel Williams) wrote in #note-4:

@shyouhei (Shyouhei Urabe) what was the point of https://github.com/ioquatix/ruby/commit/b121cfde5fbc8cd20684a5fd94047f40323a8353 - Performance? Consistency? Something else?

Hello, rb_io_fptr_finalize_internal this is a very tiny peephole optimisation to omit return value when possible. I didn't indent to make it public. If it is now that isn't what I want.

Updated by mame (Yusuke Endoh) almost 4 years ago Actions
Copy link
#6

Related to Bug #19039: Closing an IO being select'ed in another thread does not resume the thread added

Updated by ioquatix (Samuel Williams) almost 4 years ago Actions
Copy link
#7 [ruby-core:110237]

I didn't realise it but I already filed a bug for a race condition in this behaviour too: https://bugs.ruby-lang.org/issues/14681

Updated by Eregon (Benoit Daloze) almost 4 years ago Actions
Copy link
#8

Related to Bug #14681: `syswrite': stream closed in another thread (IOError) added

Updated by ioquatix (Samuel Williams) over 2 years ago Actions
Copy link
#9 [ruby-core:117308]

After reviewing async-io, it looks like wait_readable and wait_writable might not be interrupted by close, leading to some odd behaviour.

Updated by ioquatix (Samuel Williams) over 2 years ago Actions
Copy link
#10 [ruby-core:117487]

Assignee set to ioquatix (Samuel Williams)

Updated by hsbt (Hiroshi SHIBATA) almost 2 years ago Actions
Copy link
#11

Status changed from Open to Assigned

Updated by ioquatix (Samuel Williams) about 1 year ago Actions
Copy link
#12 [ruby-core:122248]

Status changed from Assigned to Closed

After merging https://github.com/ruby/ruby/pull/12839, we see nice performance improvements and the semantics are well documented.

> make benchmark BENCH_RUBY=/Users/samuel/.rubies/ruby-head/bin/ruby COMPARE_RUBY=/Users/samuel/.rubies/ruby-a6435bef/bin/ruby ITEM=io_close
/Users/samuel/.rubies/ruby-3.4.2/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver \
	            --executables="compare-ruby::/Users/samuel/.rubies/ruby-a6435bef/bin/ruby -I.ext/common --disable-gem" \
	            --executables="built-ruby::/Users/samuel/.rubies/ruby-head/bin/ruby --disable-gem" \
	            --output=markdown --output-compare -v $(find ./benchmark -maxdepth 1 -name 'io_close' -o -name '*io_close*.yml' -o -name '*io_close*.rb' | sort) 
compare-ruby: ruby 3.5.0dev (2025-05-13T08:35:34Z master a6435befa7) +PRISM [arm64-darwin24]
built-ruby: ruby 3.5.0dev (2025-05-17T00:42:12Z master aa0f689bf4) +PRISM [arm64-darwin24]
last_commit=[ruby/net-http] Freeze some constants to improve Ractor compatibility
# Iteration per second (i/s)

|                    |compare-ruby|built-ruby|
|:-------------------|-----------:|---------:|
|io_close_contended  |     117.394|   179.029|
|                    |           -|     1.53x|
|io_close            |     821.862|   900.114|
|                    |           -|     1.10x|

Actions

Copy link

Also available in: PDF Atom

Project

General

Profile

Ruby

Custom queries

Bug #18455

`IO#close` has poor performance and difficult to understand semantics.

Benchmarks¶

Updated by ioquatix (Samuel Williams) over 4 years ago Actions
Copy link
#1 [ruby-core:106940]

Updated by Eregon (Benoit Daloze) over 4 years ago Actions
Copy link
#2 [ruby-core:106941]

Updated by Eregon (Benoit Daloze) over 4 years ago Actions
Copy link
#3 [ruby-core:106944]

Updated by ioquatix (Samuel Williams) over 4 years ago Actions
Copy link
#4 [ruby-core:108141]

Updated by shyouhei (Shyouhei Urabe) over 4 years ago Actions
Copy link
#5 [ruby-core:108168]

Updated by mame (Yusuke Endoh) almost 4 years ago Actions
Copy link
#6

Updated by ioquatix (Samuel Williams) almost 4 years ago Actions
Copy link
#7 [ruby-core:110237]

Updated by Eregon (Benoit Daloze) almost 4 years ago Actions
Copy link
#8

Updated by ioquatix (Samuel Williams) over 2 years ago Actions
Copy link
#9 [ruby-core:117308]

Updated by ioquatix (Samuel Williams) over 2 years ago Actions
Copy link
#10 [ruby-core:117487]

Updated by hsbt (Hiroshi SHIBATA) almost 2 years ago Actions
Copy link
#11

Updated by ioquatix (Samuel Williams) about 1 year ago Actions
Copy link
#12 [ruby-core:122248]

Project

General

Profile

Ruby

Custom queries

Bug #18455

`IO#close` has poor performance and difficult to understand semantics.

Benchmarks¶

Updated by ioquatix (Samuel Williams) over 4 years ago ActionsCopy link #1 [ruby-core:106940]

Updated by Eregon (Benoit Daloze) over 4 years ago ActionsCopy link #2 [ruby-core:106941]

Updated by Eregon (Benoit Daloze) over 4 years ago ActionsCopy link #3 [ruby-core:106944]

Updated by ioquatix (Samuel Williams) over 4 years ago ActionsCopy link #4 [ruby-core:108141]

Updated by shyouhei (Shyouhei Urabe) over 4 years ago ActionsCopy link #5 [ruby-core:108168]

Updated by mame (Yusuke Endoh) almost 4 years ago ActionsCopy link #6

Updated by ioquatix (Samuel Williams) almost 4 years ago ActionsCopy link #7 [ruby-core:110237]

Updated by Eregon (Benoit Daloze) almost 4 years ago ActionsCopy link #8

Updated by ioquatix (Samuel Williams) over 2 years ago ActionsCopy link #9 [ruby-core:117308]

Updated by ioquatix (Samuel Williams) over 2 years ago ActionsCopy link #10 [ruby-core:117487]

Updated by hsbt (Hiroshi SHIBATA) almost 2 years ago ActionsCopy link #11

Updated by ioquatix (Samuel Williams) about 1 year ago ActionsCopy link #12 [ruby-core:122248]

Updated by ioquatix (Samuel Williams) over 4 years ago Actions
Copy link
#1 [ruby-core:106940]

Updated by Eregon (Benoit Daloze) over 4 years ago Actions
Copy link
#2 [ruby-core:106941]

Updated by Eregon (Benoit Daloze) over 4 years ago Actions
Copy link
#3 [ruby-core:106944]

Updated by ioquatix (Samuel Williams) over 4 years ago Actions
Copy link
#4 [ruby-core:108141]

Updated by shyouhei (Shyouhei Urabe) over 4 years ago Actions
Copy link
#5 [ruby-core:108168]

Updated by mame (Yusuke Endoh) almost 4 years ago Actions
Copy link
#6

Updated by ioquatix (Samuel Williams) almost 4 years ago Actions
Copy link
#7 [ruby-core:110237]

Updated by Eregon (Benoit Daloze) almost 4 years ago Actions
Copy link
#8

Updated by ioquatix (Samuel Williams) over 2 years ago Actions
Copy link
#9 [ruby-core:117308]

Updated by ioquatix (Samuel Williams) over 2 years ago Actions
Copy link
#10 [ruby-core:117487]

Updated by hsbt (Hiroshi SHIBATA) almost 2 years ago Actions
Copy link
#11

Updated by ioquatix (Samuel Williams) about 1 year ago Actions
Copy link
#12 [ruby-core:122248]