Feature #17045

ObjectSpace.dump_all should allocate as little as possible in the GC heap

Added by byroot (Jean Boussier) 6 months ago. Updated 4 months ago.


For context, I'm working on a heap profiler. In short, the use case looks like this (pseudocode):

ObjectSpace.dump_all(output: file1)
# run the user code
ObjectSpace.dump_all(output: file2)
compute_the_diff_and_report_statistics(file1, file2)

Ideally, ObjectSpace.dump_all would not modify the GC heap at all, so that empty user code would report an empty diff.
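The diff step in the pseudocode above could be sketched as follows. This is a minimal illustration, not the profiler's actual code: the helper names `dumped_addresses` and `new_objects_by_type` are hypothetical, and it assumes each line of a dump is a JSON object whose "address" key identifies the object (entries without an address, such as roots, are skipped). Matching by address is only approximate, since a slot freed by the GC between the two dumps can be reused by a new object.

```ruby
require "json"
require "set"

# Hypothetical helper: collect every object address listed in a dump file.
def dumped_addresses(path)
  File.foreach(path).each_with_object(Set.new) do |line, addresses|
    entry = JSON.parse(line)
    # Entries without an "address" key (assumed to be roots) are ignored.
    addresses << entry["address"] if entry["address"]
  end
end

# Hypothetical helper: count, per "type", the objects present in the
# second dump whose address does not appear in the first dump.
def new_objects_by_type(before_path, after_path)
  known = dumped_addresses(before_path)
  counts = Hash.new(0)
  File.foreach(after_path) do |line|
    entry = JSON.parse(line)
    address = entry["address"]
    next if address.nil? || known.include?(address)
    counts[entry["type"]] += 1
  end
  counts
end
```

With this sketch, an empty result from `new_objects_by_type(file1_path, file2_path)` would correspond to the "empty diff" described above.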

However, as showcased in this test case, dump_all(output: #<File:/path.json>) currently allocates 4 objects:

  • The File instance passed as output: is re-opened by rb_io_stdio_file in dump_output and I don't quite understand why.
  • The scan_args in dump_all allocates a Hash instance. Would using the new Primitive interface avoid that?
  • Another hash is allocated but I'm unsure where it comes from.
  • An IMEMO "imemo_type"=>"callcache" is allocated. Surprisingly it can be avoided by calling ObjectSpace.dump_all(**opts)

Could any of these be eliminated?
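One rough way to observe allocations like the ones listed above is to compare the GC's allocation counter before and after the call. This is only a sketch, not the linked test case: the helper name `allocations_during_dump` is hypothetical, and the number it reports depends on the Ruby version and may include allocations other than the four described.

```ruby
require "objspace"
require "tempfile"

# Hypothetical helper: return how many objects were allocated while
# ObjectSpace.dump_all wrote the heap to the given IO.
def allocations_during_dump(io)
  # GC.stat with a Symbol key returns a single Integer counter.
  before = GC.stat(:total_allocated_objects)
  ObjectSpace.dump_all(output: io)
  GC.stat(:total_allocated_objects) - before
end

Tempfile.create("heap") do |file|
  puts "objects allocated by dump_all: #{allocations_during_dump(file)}"
end
```

A counter like this only says how many objects were allocated, not which ones; identifying them still requires comparing two dumps.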


Updated by jeremyevans0 (Jeremy Evans) 6 months ago

  • Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN)
  • Tracker changed from Bug to Feature

Updated by byroot (Jean Boussier) 4 months ago

I opened a PR with a patch for this:


Updated by byroot (Jean Boussier) 4 months ago

  • Status changed from Open to Closed

Applied in changeset git|fbba6bd4e3dff7a61965208fecae908f10c4edbe.

Parse ObjectSpace.dump_all / dump arguments in Ruby to avoid allocation noise

[Feature #17045] ObjectSpace.dump_all should allocate as little as possible in the GC heap

Up until this commit, ObjectSpace.dump_all allocates two Hashes because of rb_scan_args.

It can also allocate a File because of rb_io_get_write_io.

These allocations are problematic because dump_all dumps the Ruby
heap, so it should try to modify what it is observing as little as possible.
