Bug #20485: Simple use of Fiber makes GC leak objects with singleton method - Ruby - Ruby Issue Tracking System

Bug #20485

closed

Simple use of Fiber makes GC leak objects with singleton method

Added by skhrshin (Shintaro Sakahara) over 1 year ago. Updated over 1 year ago.

Status:

Closed

Assignee:

Target version:

ruby -v:

ruby 3.2.4 (2024-04-23 revision af471c0e01) [x86_64-linux]

Backport:

3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN

[ruby-core:117838]

Description

I found a possible memory leak which occurs only when several conditions are met.

The code to reproduce the problem is below:

class Work
  def add_method
    singleton_class.define_method(:f) {}
  end
end

1.times { Fiber.new {}.resume }

work = Work.new
work.add_method
work = nil
GC.start

num_objs = ObjectSpace.each_object.select { |o| o.is_a?(Work) rescue false }.size
unless num_objs.zero?
  raise "NG"
end

Expected result: The script exits normally.
Actual result: RuntimeError "NG" is raised.

If I change 1.times { Fiber.new {}.resume } to just Fiber.new {}.resume or remove work.add_method, GC works as expected.
Is there a problem with the way this code uses Fiber, or is this a bug in Ruby?

I also tested ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux], and the result was slightly different. The code above did not reproduce the problem, but it did once I changed 1.times to Mutex.new.synchronize.

Related issues 1 (0 open — 1 closed)

#1 [ruby-core:117842]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

Subject changed from Simple use of Mutex and Fiber makes GC leak objects with singleton method to Simple use of Fiber makes GC leak objects with singleton method
Description updated (diff)

Update: Using Mutex was not necessary.

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

Description updated (diff)

#3 [ruby-core:117847]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

Description updated (diff)

Update: To reproduce this issue with Ruby 3.3.1, Mutex is necessary.

#4 [ruby-core:117848]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

Changing 1.times to [1].each could reproduce the problem on Ruby 3.3.1 too.

#5 [ruby-core:117849]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

ruby -v changed from ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux] to ruby 3.2.4 (2024-04-23 revision af471c0e01) [x86_64-linux]

I confirmed that all of 1.times, [1].each and Mutex.new.synchronize versions reproduce the problem on Ruby 3.2.4.
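The three variants mentioned in this thread can be compared side by side. The following is only a sketch, not something posted in the thread: it assumes each variant should run in a fresh interpreter (started via `RbConfig.ruby`) so that one run cannot influence another, and the names `VARIANTS` and `repro_script` are hypothetical. Because the result depends on the environment, no particular outcome is implied.

```ruby
require 'rbconfig'

# The three wrappers reported in this thread.
VARIANTS = {
  "1.times"               => "1.times { Fiber.new {}.resume }",
  "[1].each"              => "[1].each { Fiber.new {}.resume }",
  "Mutex.new.synchronize" => "Mutex.new.synchronize { Fiber.new {}.resume }"
}.freeze

# Hypothetical helper: build the reproduction script around one wrapper line.
# The child process exits with 0 if no Work instance survives GC, 1 otherwise.
def repro_script(wrapper_line)
  <<~RUBY
    class Work
      def add_method
        singleton_class.define_method(:f) {}
      end
    end

    #{wrapper_line}

    work = Work.new
    work.add_method
    work = nil
    GC.start

    exit(ObjectSpace.each_object(Work).count.zero? ? 0 : 1)
  RUBY
end

# Run each variant in its own interpreter and record the outcome.
results = VARIANTS.transform_values do |line|
  system(RbConfig.ruby, "-e", repro_script(line)) ? "collected" : "still alive"
end

results.each { |name, outcome| puts "#{name}: #{outcome}" }
```

Running it several times gives a rough idea of how often each variant leaves an instance behind on a given machine.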

#6 [ruby-core:117850]

Updated by byroot (Jean Boussier) over 1 year ago

Status changed from Open to Closed

Looks like a duplicate of https://bugs.ruby-lang.org/issues/19436, fixed in Ruby 3.3 but can't really be backported.

Updated by byroot (Jean Boussier) over 1 year ago

Related to Bug #19436: Call Cache for singleton methods can lead to "memory leaks" added

#8 [ruby-core:117852]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

Do you mean this is fixed in trunk? Or are you saying this shouldn't happen on Ruby 3.3.1? If the latter, that is not correct: as I wrote, the [1].each and Mutex.new.synchronize versions reproduce the problem on Ruby 3.3.1.
I would like you to reopen this issue. Should I update ruby -v to 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux] here, or should I create a new issue?

#9 [ruby-core:117854]

Updated by byroot (Jean Boussier) over 1 year ago

I closed it because I tried your repro script with ruby 3.3.1 (2024-04-23 revision c56cd86388) [arm64-darwin23], both with 1.times and Mutex.new.synchronize, and it doesn't fail.

Also, your description really fits [Bug #19436], which is why I considered it a duplicate.

If you say you can reproduce it on 3.3.1, I'll re-open, but then I have no explanation why it doesn't reproduce on my machine.

#10

Updated by byroot (Jean Boussier) over 1 year ago

Status changed from Closed to Open

#11 [ruby-core:117855]

Updated by byroot (Jean Boussier) over 1 year ago

To be honest, I also tried 3.2.2 and 3.1.4, each with [1].each, 1.times and Mutex.new.synchronize, and none of them reproduced.

So I'm starting to wonder if it isn't simply that, for some reason, one object consistently ends up on the stack in your environment.

#12 [ruby-core:117856]

Updated by byroot (Jean Boussier) over 1 year ago

@skhrshin if you can reproduce consistently, it would be helpful to provide a heap dump like this (use a service like GitHub Gist, because the output might be big):

require 'objspace'

class Work
  def add_method
    singleton_class.define_method(:f) {}
  end
end

Mutex.new.synchronize { Fiber.new {}.resume }

work = Work.new
work.add_method
puts ObjectSpace.dump(work)

work = nil
GC.start

num_objs = ObjectSpace.each_object(Work).count
unless num_objs.zero?
  puts '-' * 40
  ObjectSpace.dump_all(output: :stdout) # writes the dump to stdout itself and returns nil
  raise "NG"
end

That would allow us to trace back what's preventing the object from being garbage collected.

#13 [ruby-core:117857]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

I asked my co-workers to try this script and some of them gave me their results. The following table includes my results.

Environment	# of people	Reproducibility
ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux] on Ubuntu/WSL2	1	Probably 100%
ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux] on Ubuntu/virtualbox	2	Very high but less than 100%
ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux] on Docker Desktop/Windows	1	High but less than 100%
ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-linux] on Ubuntu/Hyper-V	1	Low (about 10%)
ruby 3.3.1 (2024-04-23 revision c56cd86388) +YJIT [arm64-darwin23]	1	0%
ruby 3.3.1 (2024-04-23 revision c56cd86388) [x86_64-darwin22]	1	0%

The person who tried it on ruby 3.3.1 (2024-04-23 revision c56cd86388) +YJIT [arm64-darwin23] also gave me the results on several Ruby versions. He said it was reproducible on 3.2.4, but not on 3.2.2.

I created a dump log by putting ObjectSpace.dump_all(output: :stdout) before raise "NG" and uploaded it to GitHub. The log doesn't contain the ObjectSpace.dump(work) output you suggested, because the script no longer reproduces the problem when something like ObjectSpace.dump(work), puts 0 or sleep 1 is placed between work.add_method and work = nil.

https://gist.github.com/skhrshin/f639e387578db8faf431adfb7ac06631#file-bugs-ruby-lang-org_issues_20485_dump_all-log

As far as I could tell, there is no OBJECT that prevents work from being GCed. The address of work appears to be 0x7f2b552d0bb8. I don't know what IMEMO is. I would appreciate it if you could help me.

#14 [ruby-core:117860]

Updated by byroot (Jean Boussier) over 1 year ago

Alright, looking at your dump:

{"address":"0x7f2b570455b8", "type":"CLASS", "shape_id":2, "slot_size":160, "class":"0x7f2b57045518", "variation_count":0, "superclass":"0x7f2b5707fd30", "name":"Work", "references":["0x7f2b5707fd30", "0x7f2b552d0d98", "0x7f2b704fea00", "0x7f2b552d0b90", "0x7f2b552d0d98", "0x7f2b552d0b68", "0x7f2b552d0b40", "0x7f2b552d0b18", "0x7f2b552d41f0"], "memsize":488, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}

This is the Work class.

{"address":"0x7f2b57045478", "type":"CLASS", "shape_id":2, "slot_size":160, "class":"0x7f2b57045518", "variation_count":0, "superclass":"0x7f2b570455b8", "real_class_name":"Work", "singleton":true, "references":["0x7f2b552d0bb8", "0x7f2b570455b8", "0x7f2b552d0a50", "0x7f2b704fe910", "0x7f2b552d0a28"], "memsize":384, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}

This is the Work instance singleton class ("superclass":"0x7f2b570455b8").

{"address":"0x7f2b552d0bb8", "type":"OBJECT", "shape_id":5, "slot_size":40, "class":"0x7f2b57045478", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}

This is the Work instance ("class":"0x7f2b57045478").

Using harb I can see it's referenced by the Proc and the singleton class:

harb> print 0x7f2b552d0bb8
    0x7f2b552d0bb8: "OBJECT"
             class: (null)
           memsize: 40
  retained memsize: 40
   referenced from: [
                      0x7f2b552d0a78 (DATA: proc)
                      0x7f2b57045478 (CLASS: (null))
                    ]

Which is expected.
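For readers without harb, a similar referrer lookup can be approximated in plain Ruby, since the dump is one JSON object per line. This is a minimal sketch under stated assumptions: `DUMP_LINES` holds trimmed copies of the three lines quoted above (only the fields used here are kept), and `referrers_of` is a hypothetical helper, not part of any library. Because the proc's dump line is not quoted in this thread, only the singleton class shows up as a referrer here.

```ruby
require 'json'

# Trimmed copies of three lines from the dump quoted above.
# A real dump produced by ObjectSpace.dump_all has many more lines and fields.
DUMP_LINES = [
  '{"address":"0x7f2b570455b8", "type":"CLASS", "references":["0x7f2b5707fd30", "0x7f2b552d0d98"]}',
  '{"address":"0x7f2b57045478", "type":"CLASS", "references":["0x7f2b552d0bb8", "0x7f2b570455b8"]}',
  '{"address":"0x7f2b552d0bb8", "type":"OBJECT"}'
].freeze

# Hypothetical helper: addresses of every dumped object whose
# "references" array contains `target`.
def referrers_of(target, lines)
  lines.filter_map do |line|
    obj = JSON.parse(line)
    obj["address"] if obj.fetch("references", []).include?(target)
  end
end

p referrers_of("0x7f2b552d0bb8", DUMP_LINES)
# prints ["0x7f2b57045478"]
```

Applying the same lookup repeatedly to each referrer is one way to check by hand whether any chain of references leads back to a root.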

However, following both references, there is no path to a root. So my understanding is simply that one of these references is left on the C stack. Since Ruby's GC scans the stack conservatively, it cannot know for sure whether such a value is a true reference, so it doesn't collect the object.

To further prove that this isn't a leak, you could loop in your reproduction script. I suspect the "leaked" objects count will remain at one.
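A looped version of the reproduction script might look like the sketch below. It is not the script used in this thread: `surviving_work_count` is a hypothetical helper, and the count it reports depends on the environment. The idea is that a true leak would leave a count that grows with the number of iterations, whereas a stray stack reference would keep it small regardless of how many iterations run.

```ruby
class Work
  def add_method
    singleton_class.define_method(:f) {}
  end
end

# Hypothetical helper: repeat the allocation pattern from the report
# `iterations` times, then count the Work instances that survive a full GC.
def surviving_work_count(iterations)
  iterations.times do
    1.times { Fiber.new {}.resume }
    work = Work.new
    work.add_method
    work = nil
  end
  GC.start
  ObjectSpace.each_object(Work).count
end

count = surviving_work_count(100)
puts "Work instances still alive after GC: #{count} (out of 100 created)"
```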

#15 [ruby-core:117861]

Updated by skhrshin (Shintaro Sakahara) over 1 year ago

I tried looping with 100.times in a more complex case, one I had created while investigating a test suite whose memory usage keeps growing until it gets killed due to OOM. As you said, only one "leaked" object remained. So I conclude that this behavior is not a problem. I apologize for wasting your time. Thank you for the great help.

#16 [ruby-core:117867]

Updated by byroot (Jean Boussier) over 1 year ago

Status changed from Open to Closed

No worries. This aspect of Ruby GC often confuses people.
