Feature #18228: Add a `timeout` option to `IO.copy_stream` - Ruby - Ruby Issue Tracking System

Added by byroot (Jean Boussier) over 4 years ago. Updated over 4 years ago.

Status: Open

Assignee:

Target version:

[ruby-core:105450]

Description

Context

In many situations involving large files, IO.copy_stream, when usable, brings major performance gains (often at least twice as fast). More importantly, when the copy is deferred to the kernel, performance is much more consistent because it is less affected by CPU utilization on the machine.

However, it is often unsafe to use because it doesn't have a timeout, so you can only use it if both the source and destination IOs are trusted. Otherwise it is trivial for an attacker to DoS the service by reading the response very slowly.

Some examples

It is used by webrick.
Net::HTTP uses it to send request bodies when they are IOs, but it is used with a "fake IO" to allow for timeouts, so sendfile(2) and similar syscalls are never used.
A proof of concept of integrating it in puma shows a 2x speedup.
Various other HTTP clients could use it as well.
I used it in private projects to download and upload large archives to and from Google Cloud Storage, to great effect.
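
To illustrate the "fake IO" approach mentioned above, here is a minimal sketch. It is not Net::HTTP's actual code; `TimeoutWriter` is a hypothetical name. Because the destination is a plain Ruby object rather than a real IO, IO.copy_stream falls back to a userspace loop that calls `write` on it, which makes a timeout possible but rules out sendfile(2).

```ruby
require "timeout"

# Hypothetical wrapper (illustration only). It is not an IO, so
# IO.copy_stream cannot hand the copy to the kernel and instead calls
# #write for every chunk it reads from the source.
class TimeoutWriter
  def initialize(io, timeout)
    @io = io
    @timeout = timeout
  end

  # Writes the whole chunk, waiting at most @timeout seconds each time the
  # underlying IO is not writable. Returns the number of bytes written.
  def write(chunk)
    written = 0
    while written < chunk.bytesize
      rest = chunk.byteslice(written, chunk.bytesize - written)
      result = @io.write_nonblock(rest, exception: false)
      if result == :wait_writable
        # IO.select returns nil when the timeout expires.
        unless IO.select(nil, [@io], nil, @timeout)
          raise Timeout::Error, "write timed out after #{@timeout}s"
        end
      else
        written += result
      end
    end
    written
  end
end
```

Usage would look like `IO.copy_stream(file, TimeoutWriter.new(socket, 5))`, where `file` and `socket` are whatever source and destination the caller already has.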

Possible implementation

The main difficulty is that the underlying syscalls don't have a timeout either.

The main syscall used in these scenarios is sendfile(2). It doesn't have a timeout parameter; however, if called on file descriptors with O_NONBLOCK, it returns early, which allows for a select/poll loop. I did a very quick and dirty experiment with this, and it does seem to work.

The other two accelerating syscalls are copy_file_range(2) (Linux) and fcopyfile(2) (macOS). Neither has a timeout, and neither man page documents an EAGAIN / EWOULDBLOCK error. However, these syscalls are limited to real file copies, and timeouts for real files are generally a less critical need, so it would be possible to simply not use these syscalls when a timeout is provided.
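
The shape of such a loop can be sketched in plain Ruby. As far as I know the standard library has no direct sendfile(2) binding, so this sketch substitutes `read_nonblock` / `write_nonblock` for the syscall and uses `IO.select` for the wait. `copy_with_timeout` and its parameters are hypothetical names, not an existing API.

```ruby
# Hypothetical helper (illustration only): copies src to dst in userspace
# and gives up when either side stalls for longer than `timeout` seconds.
# Returns [bytes_copied, timed_out].
def copy_with_timeout(src, dst, timeout:, chunk_size: 16 * 1024)
  copied = 0
  loop do
    chunk = src.read_nonblock(chunk_size, exception: false)
    case chunk
    when nil # EOF on src
      return [copied, false]
    when :wait_readable
      # Wait for src to become readable; IO.select returns nil on timeout.
      return [copied, true] unless IO.select([src], nil, nil, timeout)
      next
    end

    # Drain the chunk into dst, waiting whenever dst would block.
    until chunk.empty?
      n = dst.write_nonblock(chunk, exception: false)
      if n == :wait_writable
        return [copied, true] unless IO.select(nil, [dst], nil, timeout)
      else
        copied += n
        chunk = chunk.byteslice(n, chunk.bytesize - n)
      end
    end
  end
end
```

Note one limitation of this sketch: if the timeout fires while writing, bytes already read from src but not yet written are dropped, so `bytes_copied` only reflects what reached dst. A kernel-side sendfile(2) loop would not have this gap, since it reports exactly how many bytes were transferred.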

Interface

copy_stream(src, dst, copy_length, src_offset, timeout)
or copy_stream(src, dst, copy_length, src_offset, timeout: nil)

As for the return value in case of a timeout, it is important to convey both that a timeout happened, and the number of bytes that were copied, otherwise it makes retries impossible.

It could simply return the number of bytes copied, and let the caller compare it to the expected number of bytes, but that wouldn't work in cases where the size of src isn't known.
It could return -1 - bytes_copied, which is not particularly elegant but would work.
It could return multiple values or some kind of result object when a timeout is provided.
It could raise an error, with bytes_copied as an attribute on the error.

Alternatively, copy_stream could be left without a timeout, and some kind of copy_stream2 introduced, so that the return value of copy_stream isn't made inconsistent.
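
To make two of the return-value options concrete, here is a rough sketch of the `-1 - bytes_copied` encoding and of the error-with-attribute option. All names (`CopyTimeoutError`, `encode_timeout`, `decode_result`, `copy_with_retries`) are hypothetical; the proposal doesn't name any of them. The block passed to `copy_with_retries` stands in for a timeout-aware copy_stream, which doesn't exist yet.

```ruby
# Hypothetical error class carrying the progress made before the timeout.
class CopyTimeoutError < StandardError
  attr_reader :bytes_copied

  def initialize(bytes_copied)
    @bytes_copied = bytes_copied
    super("copy timed out after #{bytes_copied} bytes")
  end
end

# "-1 - bytes_copied" option: every timeout maps to a negative number, so a
# timeout after 0 bytes (-1) stays distinct from a clean copy of 0 bytes (0).
def encode_timeout(bytes_copied)
  -1 - bytes_copied
end

def decode_result(result)
  if result < 0
    { timed_out: true, bytes_copied: -1 - result }
  else
    { timed_out: false, bytes_copied: result }
  end
end

# Error option: resume from where the previous attempt stopped. The block
# receives the offset to resume from and must return the bytes it copied,
# or raise CopyTimeoutError. Returns the total number of bytes copied.
def copy_with_retries(max_attempts:)
  offset = 0
  max_attempts.times do
    begin
      return offset + yield(offset)
    rescue CopyTimeoutError => e
      offset += e.bytes_copied
    end
  end
  raise CopyTimeoutError.new(offset)
end
```

With either option the caller can feed the recovered byte count back in as src_offset on the next attempt, which is what makes retries possible.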

Related issues 1 (0 open — 1 closed)

#1 Updated by byroot (Jean Boussier) over 4 years ago
#2 [ruby-core:105451] Updated by Eregon (Benoit Daloze) over 4 years ago
#3 [ruby-core:105453] Updated by ioquatix (Samuel Williams) over 4 years ago
#4 [ruby-core:105454] Updated by byroot (Jean Boussier) over 4 years ago
#5 [ruby-core:105463] Updated by Eregon (Benoit Daloze) over 4 years ago
#6 [ruby-core:105464] Updated by byroot (Jean Boussier) over 4 years ago
#7 [ruby-core:105502] Updated by ioquatix (Samuel Williams) over 4 years ago
#8 [ruby-core:105503] Updated by byroot (Jean Boussier) over 4 years ago
#9 [ruby-core:105514] Updated by Eregon (Benoit Daloze) over 4 years ago
#10 [ruby-core:105517] Updated by normalperson (Eric Wong) over 4 years ago
#11 [ruby-core:105518] Updated by normalperson (Eric Wong) over 4 years ago
