Bug #13597
closed
Does read_nonblock call realloc for the buffer or does it just set the size attribute
Description
Hello
I've observed that a lot of memory gets allocated and wasted when read_nonblock is called for a number of bytes much larger than is actually read from the socket.
This line https://github.com/ruby/ruby/blob/0130bb01baed404c0e3c75bd5db472415a6da1d3/io.c#L2686 appears to eventually only change the heap size value here https://github.com/ruby/ruby/blob/144e06700705a3f067582682567bc77b429c4fca/string.c#L104 but does not call realloc.
I see this request to allow an offset to be passed to read_nonblock:
https://bugs.ruby-lang.org/issues/11484
but until that is implemented, how do you recommend efficiently asking to read a large number of bytes from a socket? If I'm not mistaken, if I request 16000000, but only read 1000000, the buffer that has been allocated in io_read_nonblock for 16000000 doesn't seem to be resized.
Would you recommend instead requesting a more predictable number of bytes, closer to the default system value (SO_RCVBUF, for example) in each call to read_nonblock?
For context, this pull request against the MongoDB Ruby driver has led me to this investigation. https://github.com/mongodb/mongo-ruby-driver/pull/864
Thank you in advance
Emily
Updated by normalperson (Eric Wong) almost 7 years ago
emily@mongodb.com wrote:
Hello
I've observed that a lot of memory gets allocated and wasted
when read_nonblock is called for a number of bytes much larger
than is actually read from the socket. This line
https://github.com/ruby/ruby/blob/0130bb01baed404c0e3c75bd5db472415a6da1d3/io.c#L2686
appears to eventually only change the heap size value here
https://github.com/ruby/ruby/blob/144e06700705a3f067582682567bc77b429c4fca/string.c#L104
but does not call realloc.
Correct. We do not realloc here since there is a good chance
the buffer can be reused soon after and need the larger size.
realloc can be very expensive.
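To make the retained allocation visible, here is a minimal sketch, assuming CRuby and its implementation-specific ObjectSpace.memsize_of hint. The helper name read_and_measure and the pipe setup are illustrative assumptions, not part of the original report. It reads 5 bytes into a caller-supplied buffer while asking for up to 1000000, then reports how much memory the buffer still holds.

```ruby
require 'objspace'

# Hypothetical helper (not from the thread): read once into a caller-supplied
# buffer, then report the string length and the memory the buffer holds.
def read_and_measure(io, maxlen)
  buf = String.new                 # mutable destination buffer (the "outbuf")
  io.read_nonblock(maxlen, buf)    # fills buf in place and returns it
  { bytes_read: buf.bytesize, memsize: ObjectSpace.memsize_of(buf) }
end

reader, writer = IO.pipe
writer.write('hello')              # only 5 bytes are available to read
stats = read_and_measure(reader, 1_000_000)
puts "read #{stats[:bytes_read]} bytes, buffer holds about #{stats[:memsize]} bytes"
```

The reported memory size should stay close to the requested maximum rather than the 5 bytes read, which matches the "no realloc" behaviour described above for a supplied buffer.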
I see this request to allow an offset to be passed to
read_nonblock:
https://bugs.ruby-lang.org/issues/11484
Thanks for pinging on that, I guess I'll try implementing it at
some point (but I will need matz approval to make API changes).
but until that is implemented, how do you recommend
efficiently asking to read a large number of bytes from a
socket? If I'm not mistaken, if I request 16000000, but only
read 1000000, the buffer that has been allocated in
io_read_nonblock for 16000000 doesn't seem to be resized.
You can use String#clear right away on the result:
  rbuf = ''
  tmp = ''
  case ret = io.read_nonblock(16384, tmp, exception: false)
  when String
    # tmp.object_id == ret.object_id at this point
    rbuf << ret
    ret.clear # calls free(3) internally
  else
    ...
  end while true
And you can also clear the bigger rbuf when you're done.
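A runnable variant of the loop above might look like the following sketch. The helper name drain, the CHUNK constant, and the pipe-plus-thread setup are assumptions added for illustration. The handling of :wait_readable and nil stands in for the elided else branch and is only one plausible choice, relying on the documented return values of read_nonblock with exception: false.

```ruby
CHUNK = 16_384 # 16K per read, the size suggested in this thread

# Hypothetical helper: read until EOF, reusing one scratch buffer and
# clearing it after each chunk so its memory is released promptly.
def drain(io, chunk = CHUNK)
  rbuf = String.new                # accumulates everything read
  tmp  = String.new                # scratch buffer passed as outbuf
  loop do
    case ret = io.read_nonblock(chunk, tmp, exception: false)
    when String
      rbuf << ret                  # ret is the same object as tmp
      ret.clear                    # release the scratch buffer's memory
    when :wait_readable
      IO.select([io])              # wait until more data is available
    when nil
      break                        # EOF
    end
  end
  rbuf
end

reader, writer = IO.pipe
payload = 'x' * 100_000
producer = Thread.new { writer.write(payload); writer.close }
data = drain(reader)
producer.join
puts data.bytesize
```

String.new is used instead of '' so the buffers stay mutable even when frozen string literals are enabled.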
Coincidentally, I made a similar change to net/protocol for
net/http in the stdlib this weekend:
https://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=revision&revision=58840
But of course, I expect a destination offset [Feature #11484]
to be more helpful.
Would you recommend instead requesting a more predictable
number of bytes, closer to the default system value
(SO_RCVBUF, for example) in each call to read_nonblock?
That might be too complicated and a waste of syscalls in the
general case. I'm not sure I saw value in going with sizes
larger than 1MB, and usually 16K is fine. Using giant values
like 16MB will blow away your CPU cache. Maybe, (just maybe)
16MB helps with really big transfers across LFNs
(long-fat-networks), but I doubt that's a common case for DBs
:)
For context, this pull request against the MongoDB Ruby driver has led me to this investigation. https://github.com/mongodb/mongo-ruby-driver/pull/864
I don't agree with GitHub's Terms-of-Service nor do I run
Javascript or look at images; but I dumped that text and read
it; so I'll add some notes here:
In my experience, 4K is too small for even 70ms latency
connections, but that might've just been on the writing
side... I would choose 8K, at least, but usually 16K. It
also depends on network latency and hardware.
Choosing 16K also has a good side effect with current CRuby: a
malloc implementation can internally reuse space which Ruby
uses internally for buffers; potentially reducing
fragmentation and helping cache latency. And we (CRuby) have
been using 16K for most IO buffers for a long time...
Anyways, I'll be glad to help with further network-related
Ruby stuff on here as long as everything is plain text.
Updated by emilys (Emily Stolfo) almost 7 years ago
Hi Eric
Thank you so much for your response - it provided a lot of useful information I didn't know otherwise. I've pointed the user who opened the pull request to your response so he has the chance to update his code based on the new information.
I haven't heard back from him yet but in the meantime, I'll do some testing and see what I find to be the optimal solution. I'll certainly ping you again if I have questions...and will also look forward to perhaps having the ability to pass an offset to read_nonblock in the future.
Thanks again
Emily
Updated by akr (Akira Tanaka) over 6 years ago
I think that it's possible to call realloc when the
"outbuf" argument is not supplied to read_nonblock.
That makes it possible to
automatically reduce memory with realloc (without supplying "outbuf") and
reuse the buffer without realloc (when supplying "outbuf").
Updated by akr (Akira Tanaka) over 6 years ago
We discussed this in a developer meeting.
It is better to call realloc when outbuf is not given and
maxlen - len is bigger than 4K, where len is the length actually read.
(If maxlen - len is too small, the memory freed is not worth
realloc's cost, such as acquiring a lock.)
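The rule from the meeting can be sketched as a small predicate. This is only an illustration of the decision described above, written in Ruby rather than the C of io.c. The names SHRINK_THRESHOLD and shrink_read_buffer? are hypothetical, and the committed code may differ in detail.

```ruby
SHRINK_THRESHOLD = 4096 # the 4K gap mentioned above; constant name is assumed

# Hypothetical predicate: shrink only a buffer that read_nonblock created
# itself (no outbuf given), and only when the unused tail exceeds 4K.
def shrink_read_buffer?(maxlen, len, outbuf_given:)
  !outbuf_given && (maxlen - len) > SHRINK_THRESHOLD
end

puts shrink_read_buffer?(16_000_000, 1_000_000, outbuf_given: false)
puts shrink_read_buffer?(16_384, 16_000, outbuf_given: false)
puts shrink_read_buffer?(16_000_000, 1_000_000, outbuf_given: true)
```

Only the first call qualifies for shrinking: the second leaves a gap under 4K, and the third supplies outbuf, which the caller is expected to reuse.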
Updated by nobu (Nobuyoshi Nakada) over 6 years ago
- Tracker changed from Misc to Bug
- Backport set to 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
Updated by nobu (Nobuyoshi Nakada) over 6 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r59701.
io.c: shrink read buffer
- io.c (io_setstrbuf): return true if the buffer is newly created.
- io.c (io_set_read_length): shrink the read buffer if it is a new
  object and is too large. [ruby-core:81370] [Bug #13597]