Bug #14745 (Closed)
High memory usage when using String#replace with IO.copy_stream
Description
I'm using custom IO-like objects that implement #read as the first argument to IO.copy_stream, and I noticed odd memory behaviour when using String#replace on the output buffer versus String#clear. Here is an example of a "fake IO" object where #read uses String#clear on the output buffer:
GC.disable
require "stringio"

class FakeIO
  def initialize(content)
    @io = StringIO.new(content)
  end

  def read(length, outbuf)
    chunk = @io.read(length)
    if chunk
      outbuf.clear
      outbuf << chunk
      chunk.clear
    else
      outbuf.clear
    end
    outbuf unless outbuf.empty?
  end
end

io = FakeIO.new("a" * 50*1024*1024) # 50MB
IO.copy_stream(io, File::NULL)
system "top -pid #{Process.pid}"
This program reports memory usage of 50MB at the end, as expected: 50MB was loaded into memory at the beginning, and any new strings are deallocated. However, if I modify the #read implementation to use String#replace instead of String#clear:
def read(length, outbuf)
  chunk = @io.read(length)
  if chunk
    outbuf.replace chunk
    chunk.clear
  else
    outbuf.clear
  end
  outbuf unless outbuf.empty?
end
the memory usage has now doubled to 100MB at the end of the program, indicating that some string bytes weren't successfully deallocated. So, it seems that String#replace has different behaviour compared to String#clear + String#<<.
I was only able to reproduce this with IO.copy_stream; the following program shows 50MB of memory usage regardless of whether the String#clear or String#replace approach is used:
GC.disable

buffer = "a" * 50*1024*1024
chunk = "b" * 50*1024*1024

if ARGV[0] == "clear"
  buffer.clear
  buffer << chunk
else
  buffer.replace chunk
end

chunk.clear
system "top -pid #{Process.pid}"
With this program I also noticed one interesting thing. If I remove chunk.clear, then the "clear" version uses 100MB as expected (because both the buffer and chunk strings are 50MB large), but the "replace" version uses only 50MB, which makes it appear that the buffer string doesn't use any memory when in fact it should use 50MB just like the chunk string. I found that odd, and I think it might be a clue to the memory bug with String#replace I experienced when using IO.copy_stream.
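(A way to observe this sharing directly, assuming CRuby: ObjectSpace.memsize_of reports only the roughly 40-byte object slot for a string that shares another string's buffer, so after the replace both strings appear to own almost no memory while the 50MB buffer lives in a shared internal string. The numbers below are illustrative.)

require "objspace"

chunk = "b" * 50*1024*1024
buffer = ""

p ObjectSpace.memsize_of(chunk)  # ~52MB: chunk owns its malloc-ed buffer
buffer.replace chunk
p ObjectSpace.memsize_of(buffer) # ~40 bytes: buffer only shares that buffer
p ObjectSpace.memsize_of(chunk)  # ~40 bytes: chunk now shares it too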
Updated by normalperson (Eric Wong) over 6 years ago
janko.marohnic@gmail.com wrote:
> the memory usage has now doubled to 100MB at the end of the
> program, indicating that some string bytes weren't
> successfully deallocated. So, it seems that String#replace has
> different behaviour compared to String#clear + String#<<.
Yes, this is an unfortunate side effect of the copy-on-write
semantics of String#replace. If the arg (other_str) for
String#replace is non-frozen, a new frozen string is created
using the existing malloc-ed pointer. Both the receiver string
and other_str point to that new, shared string.
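A minimal sketch of that behaviour (assuming CRuby's copy-on-write
string sharing):

GC.disable

a = "x" * 50*1024*1024
b = ""
b.replace a # a's buffer moves into a hidden frozen string; a and b both share it

a.clear # empties a, but the hidden frozen string still holds the 50MB buffer
b.clear # same for b; with GC disabled, the 50MB is never freed in this program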
> I was only able to reproduce this with IO.copy_stream
I suspect part of this is because outbuf becomes a long-lived object
with IO.copy_stream (not sure), and the hidden frozen string
becomes long-lived, as well.
So yeah; well-intentioned optimizations hurt when combined.
The other part could be that anything using IO#write could
create massive amounts of garbage before 2.5:
https://bugs.ruby-lang.org/issues/13085 And your copy_stream use
hits the IO#write case. Unfortunately, the "fix" for [Bug
#13085] won't work effectively with shared strings, because we
can't free+recycle a string which wasn't created internally
by the VM.
Updated by normalperson (Eric Wong) over 6 years ago
janko.marohnic@gmail.com wrote:
> def read(length, outbuf)
>   chunk = @io.read(length)
>   if chunk
>     outbuf.replace chunk
>     chunk.clear
Elaborating on my previous comment, chunk.clear is a no-op in
this case because the shared frozen string is still in play and
now used by outbuf.
>   else
>     outbuf.clear
Likewise, the final #clear is also a no-op because the
hidden+frozen string is still floating around waiting to be
GC-ed.
>   end
>   outbuf unless outbuf.empty?
> end
I don't think you need IO.copy_stream or IO#write to trigger, even.
io = FakeIO.new("a" * 50*1024*1024) # 50MB
buf = ''.b
while z = io.read(16384, buf)
end
I tried playing with shortening object lifetimes, but I guess GC
is not aggressive enough by default to matter.
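For what it's worth, the hidden frozen string is ordinary garbage
once nothing shares it, so a forced GC does reclaim it; the problem
is only that it lingers until GC runs. A hedged sketch:

a = "x" * 50*1024*1024
b = "".replace(a) # the hidden frozen string now holds the 50MB buffer
a.clear
b.clear
GC.start # nothing references the hidden string anymore; this reclaims the 50MB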
Updated by normalperson (Eric Wong) over 6 years ago
janko.marohnic@gmail.com wrote:
> def initialize(content)
>   @io = StringIO.new(content)
> end
>
> def read(length, outbuf)
>   chunk = @io.read(length)
Finally, I always assumed your example is a contrived case and
you're dealing with an interface somewhere (not StringIO) which
doesn't accept a destination buffer for .read.
Fwiw, I encountered a similar problem with Net::HTTP and
Zlib::GzipReader earlier this year and made r61665 to ensure
outbuf got used all the way down the call chain.
Sometimes I guess modifying/fixing interfaces is the only way to go.
(unless we can do escape analysis, but I'm not sure how feasible
it is with a dynamic language)
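As an illustration of that kind of interface fix, here is a
hypothetical pass-through wrapper (not code from r61665) that
threads the caller's destination buffer all the way down:

class Wrapper
  def initialize(io)
    @io = io
  end

  # Forward the caller's buffer so the underlying read fills it in
  # place, instead of allocating a fresh chunk string per call.
  def read(maxlen = nil, outbuf = +"")
    @io.read(maxlen, outbuf)
  end
end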
Updated by janko (Janko Marohnić) over 6 years ago
> Yes, this is an unfortunate side effect of the copy-on-write
> semantics of String#replace. If the arg (other_str) for
> String#replace is non-frozen, a new frozen string is created
> using the existing malloc-ed pointer. Both the receiver string
> and other_str point to that new, shared string.
>
> So yeah; well-intentioned optimizations hurt when combined.
That makes sense, thanks for the explanation.
> The other part could be that anything using IO#write could
> create massive amounts of garbage before 2.5
This is on Ruby 2.5.1, so I'm guessing it doesn't apply here.
> Finally, I always assumed your example is a contrived case and
> you're dealing with an interface somewhere (not StringIO) which
> doesn't accept a destination buffer for .read.
The example was simplified for reproducing purposes. The place where I discovered this was in https://github.com/rocketjob/symmetric-encryption/pull/98 (I eventually managed to figure out String#replace was causing the high memory usage, so I switched to String#clear).

In short, the SymmetricEncryption::Reader object wraps an IO object with encrypted content, and when calling #read it reads data from the underlying IO object, decrypts it and returns the decrypted data. So, it's not patching the lack of an outbuf argument (because the underlying IO object should accept the outbuf argument); rather, it provides an IO#read interface over incrementally decrypting IO object content.

If this is necessary behaviour when having the copy-on-write optimization, feel free to close this ticket.
Updated by normalperson (Eric Wong) over 6 years ago
janko.marohnic@gmail.com wrote:
> I wrote:
> > Finally, I always assumed your example is a contrived case and
> > you're dealing with an interface somewhere (not StringIO) which
> > doesn't accept a destination buffer for .read.
>
> The example was simplified for reproducing purposes. The place
> where I discovered this was in
> https://github.com/rocketjob/symmetric-encryption/pull/98 (I
> eventually managed to figure out String#replace was causing
> the high memory usage, so I switched to String#clear).
>
> In short, the SymmetricEncryption::Reader object wraps an IO
> object with encrypted content, and when calling #read it
> reads data from the underlying IO object, decrypts it and
> returns the decrypted data. So, it's not patching the lack of
> an outbuf argument (because the underlying IO object should
> accept the outbuf argument); rather, it provides an IO#read
> interface over incrementally decrypting IO object content.
I may be misreading(*), but it looks like @stream_cipher.update
can take a second destination arg (like IO#read and friends) and
maybe that helps... (that appears to be OpenSSL::Cipher#update)
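For illustration, a hedged sketch of a decrypting #read along those
lines; DecryptingReader and its instance variables are hypothetical
names, not taken from the PR:

require "openssl"

class DecryptingReader
  def initialize(io, cipher)
    @io = io          # underlying IO with encrypted content
    @cipher = cipher  # an OpenSSL::Cipher already set up for decryption
    @ciphertext = +"" # reused scratch buffer for the encrypted bytes
  end

  # Both IO#read and OpenSSL::Cipher#update accept a destination
  # buffer, so no intermediate string is allocated per chunk.
  # (A complete implementation would also append @cipher.final at EOF.)
  def read(length, outbuf)
    if @io.read(length, @ciphertext)
      @cipher.update(@ciphertext, outbuf) # decrypt directly into outbuf
      outbuf unless outbuf.empty?
    else
      outbuf.clear
      nil
    end
  end
end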
If that's not usable somehow, I've sometimes wondered if adding a
String#exchange! method might help:
foo = "hello............................"
bar = "world............................"
foo.exchange!(bar)
p foo # => "world............................"
p bar # => "hello............................"
In your case, you could still use String#clear with this:
dst.clear
dst.exchange!(tmp)
# tmp is now '' because dst was cleared before
For non-embed strings, it would swap the pointers and be zero-copy,
but won't suffer the new-frozen-string-behind-your-back problem
of String#replace.
(*) I'm looking at your commit 8d41efacff26f3357016d6b611bee174802fba66
after git clone-ing (since I don't do JS-infested web UIs)
Updated by janko (Janko Marohnić) over 6 years ago
> it looks like @stream_cipher.update can take a second destination
> arg (like IO#read and friends) and maybe that helps... (that
> appears to be OpenSSL::Cipher#update)
Cool, I didn't know that! Thanks, this allowed me to greatly simplify the #read implementation; it's now both simpler and uses less memory than the original.
> I've sometimes wondered if adding a String#exchange! method might help
That sounds very useful to have in general. In my particular scenario (the PR) I wouldn't have use for it after all; this String#clear vs String#replace thing was just something I encountered during the optimization, and the current version doesn't need String#replace. But I can imagine String#exchange! being another useful tool for diligent string management.
Feel free to close the ticket, as this seems to be a necessary consequence of the copy-on-write optimization, and I'm not having problems with it anymore.
Updated by normalperson (Eric Wong) over 6 years ago
Good to know the destination buffer to #update helped :>
> But I can imagine String#exchange! being another useful tool for
> diligent string management.
Maybe... As it stands, destination buffers for #read and #update
are underutilized, and I'm glad to see them being used more.
Thus I'm not hopeful of #exchange! being widely adopted...
When used, destination buffers would still offer better
performance anyway, as there's no extra object slot used.
Updated by jeremyevans0 (Jeremy Evans) about 5 years ago
- Status changed from Open to Closed