Feature #5605
closed[PATCH] net/http: use IO.copy_stream for requests using body_stream
Description
This significantly reduces both user and system CPU usage in the
client while making uploads. When using plain HTTP with a
known Content-Length, IO.copy_stream may also use sendfile() to
further reduce user CPU time.
I tested using the following script to upload a 1 gigabyte file.
------------------- net-http-upload.rb -------------------------
require "net/http"
require "benchmark"
require "tempfile"
tmp = Tempfile.new("sparse")
tmp.sysseek(1024 * 1024 * 1024) # 1 gigabyte
tmp.syswrite(".")
host = "127.0.0.1"
port = 1234
res = nil
Net::HTTP.start(host, port) do |http|
put = Net::HTTP::Put.new("/")
tmp.sysseek(0)
put.body_stream = tmp
put.content_length = tmp.size
puts "with Content-Length: #{tmp.size}"
bm = Benchmark.measure { res = http.request(put) }
p [ res, res.body ]
printf("utime=%0.3f stime=%0.3f\n", bm.utime, bm.stime)
end
Net::HTTP.start(host, port) do |http|
put = Net::HTTP::Put.new("/")
tmp.sysseek(0)
put.body_stream = tmp
put["Transfer-Encoding"] = "chunked"
puts "with Transfer-Encoding: chunked"
bm = Benchmark.measure { res = http.request(put) }
p [ res, res.body ]
printf("utime=%0.3f stime=%0.3f\n", bm.utime, bm.stime)
end
------------------------ Results -------------------------------
before¶
with Content-Length: 1073741825
[#<Net::HTTPOK 200 OK readbody=true>,
"dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=2.660 stime=1.680
with Transfer-Encoding: chunked
[#<Net::HTTPOK 200 OK readbody=true>,
"dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=6.230 stime=1.900
after¶
with Content-Length: 1073741825
[#<Net::HTTPOK 200 OK readbody=true>,
"dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.010 stime=0.410
with Transfer-Encoding: chunked
[#<Net::HTTPOK 200 OK readbody=true>,
"dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.320 stime=0.620
----------------------- Server setup --------------------------
Any server that'll accept chunked uploads works, I'm most
familiar with unicorn. I used unicorn 4.1.1 with
rewindable_input disabled, running t/sha1.ru from the Rainbows!
source tree. t/sha1.ru returns the SHA1 of the uploaded body to
verify the upload is sending correct data.
$ git clone git://bogomips.org/rainbows
$ unicorn -E none -p 1234 -c unicorn.conf rainbows/t/sha1.ru
unicorn.conf.rb only had one line to disable rewindable input:
rewindable_input false
This was to prevent disk I/O from slowing the overall runtime
of the test, it doesn't have a significant effect on measured
CPU time in the client process even on the same box.
Files
Updated by drbrain (Eric Hodel) almost 13 years ago
=begin
For the chunked? half, would looping over (({IO.copy_stream f, sock.io, 1024})) be faster?
=end
Updated by normalperson (Eric Wong) almost 13 years ago
Eric Hodel drbrain@segment7.net wrote:
For the chunked? half, would looping over
(({IO.copy_stream f, sock.io, 1024})) be faster?
Yes, but its a rare case, needs to depend on a non-portable ioctl(),
and isn't much faster.
You can't blindly specify 1024 bytes because 'f' isn't guaranteed to
have 1024 readable. But you need to write the "%x\r\n" size
header before calling copy_stream, so I'm using IO#nread.
Chunked uploads only make sense if:
- f is a pipe/socket
- you want to generate something like a Content-MD5 Trailer:
on-the-fly and don't want to read f twice.
In either case, sendfile() can't be used by IO.copy_stream. We can
teach IO.copy_stream to splice() on Linux, maybe[1]
I don't know if it's worth it to introduce the following patch, it only
saves 200ms user CPU time on a 1G upload for an uncommon use case.
I'm concerned about the portability of IO#nread. Even if IO#nread is
available for the IO object, there's no guarantee the FIONREAD ioctl()
is implemented (or implemented correctly) for the underlying file type
given all the OSes Ruby supports.
--- a/lib/net/http.rb
+++ b/lib/net/http.rb
@@ -21,6 +21,7 @@
require 'net/protocol'
require 'uri'
+require 'io/wait'
autoload :OpenSSL, 'openssl'
module Net #:nodoc:
@@ -1965,9 +1966,25 @@ module Net #:nodoc:
write_header sock, ver, path
wait_for_continue sock, ver if sock.continue_timeout
if chunked?
-
chunker = Chunker.new(sock)
-
IO.copy_stream(f, chunker)
-
chunker.finish
-
io = sock.io
-
if io.respond_to?(:nread)
-
seekable_p = f.stat.file?
-
f.wait
-
len = f.nread
-
while len > 0
-
io.write("#{len.to_s(16)}\r\n")
-
IO.copy_stream(f, io, len)
-
f.pos += len if seekable_p
-
io.write("\r\n")
-
f.wait or break
-
len = f.nread
-
end
-
io.write("0\r\n\r\n")
-
else
-
chunker = Chunker.new(sock)
-
IO.copy_stream(f, chunker)
-
chunker.finish
-
end else # copy_stream can sendfile() to sock.io unless we use SSL. # If sock.io is an SSLSocket, copy_stream will hit SSL_write()
Since chunked encoding is normally used for pipe/sockets
of unknown length, I added the following to my test script to test:
r, w = IO.pipe
tmp.sysseek(0)
pid = fork { IO.copy_stream(tmp.to_io, w) }
w.close
Net::HTTP.start(host, port) do |http|
put = Net::HTTP::Put.new("/")
put.body_stream = r
put["Transfer-Encoding"] = "chunked"
puts "with (pipe) Transfer-Encoding: chunked"
bm = Benchmark.measure { res = http.request(put) }
p [ res, res.body ]
printf("utime=%0.3f stime=%0.3f\n", bm.utime, bm.stime)
end
Process.waitpid2(pid)
------------------------ Results -------------------------------
afterwith Content-Length: 1073741825
[#<Net::HTTPOK 200 OK readbody=true>,
"dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.010 stime=0.410
^-- Oops, I realized stime=0.410 is high
because I was on a cold cache, even
with a sparse file. It should be 0.090
with Transfer-Encoding: chunked
[#<Net::HTTPOK 200 OK readbody=true>,
"dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.320 stime=0.620
With a regular file, chunked becomes as fast as with Content-Length:¶
with (file) Transfer-Encoding: chunked
[#<Net::HTTPOK 200 OK readbody=true>, "dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.010 stime=0.090
I think a better alternative would be to teach Net::HTTP to
automatically set Content-Length if body_stream is a regular file.
Chunked requests is are still an uncommon feature, I think.
With a pipe, using IO.copy_stream calling Chunker¶
with (pipe) Transfer-Encoding: chunked
[#<Net::HTTPOK 200 OK readbody=true>, "dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.350 stime=0.940
With a pipe, using IO#nread loop calling IO.copy_stream¶
with (pipe) Transfer-Encoding: chunked
[#<Net::HTTPOK 200 OK readbody=true>, "dcdd67a8f07b73796c0485890d48fa697966d09f\n"]
utime=0.050 stime=0.940
[1] - We have to be careful since there are still old kernels with
a buggy splice() implementation.
Updated by mame (Yusuke Endoh) over 12 years ago
- Status changed from Open to Assigned
- Assignee set to naruse (Yui NARUSE)
I tentatively assign this issue to Naruse-san because
he is running for the maintainer of net/http.
--
Yusuke Endoh mame@tsg.ne.jp
Updated by naruse (Yui NARUSE) over 12 years ago
The original patch looks better than [ruby-core:40903]'s.
drbrain, do you have any idea?
If no one object it, I'll merge original one.
Updated by normalperson (Eric Wong) over 12 years ago
"naruse (Yui NARUSE)" naruse@airemix.jp wrote:
Issue #5605 has been updated by naruse (Yui NARUSE).
The original patch looks better than [ruby-core:40903]'s.
drbrain, do you have any idea?If no one object it, I'll merge original one.
I prefer the original, too. I don't think my patch in
[ruby-core:40903] is worth the effort/maintenance for a
corner case.
Updated by drbrain (Eric Hodel) over 12 years ago
I prefer the original too.
Updated by naruse (Yui NARUSE) over 12 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r35281.
Eric, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- lib/net/http.rb (Net::HTTP#send_request_with_body_stream):
use IO.copy_stream for requests using body_stream.
patched by Eric Wong. [ruby-core:40898] [Feature #5605]