Feature #19783
Updated by peterzhu2118 (Peter Zhu) over 1 year ago
GitHub PR: https://github.com/ruby/ruby/pull/8113
I'm proposing support for weak references in the Ruby garbage collector. This
feature adds a new function, `void rb_gc_mark_weak(VALUE *ptr)`, which marks
`*ptr` as weak: if no other object strongly marks `*ptr` (using `rb_gc_mark`
or `rb_gc_mark_movable`), then the GC overwrites the slot with
`*ptr = Qundef`.
Weak references are implemented using a buffer in `objspace` that stores every
`ptr` registered during the latest marking phase. After marking has finished,
we iterate over the buffer and check whether each `*ptr` points to a dead
object. If it does, we set `*ptr = Qundef`.
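The buffer-and-sweep idea above can be sketched as a toy model in plain Ruby. This is not the collector's code: `ToyMarker`, `Slot`, `UNDEF`, and `run_toy_marking` are hypothetical names invented for illustration, and a symbol stands in for `Qundef`.

```ruby
# Toy model only: plain Ruby objects stand in for the collector's internals.
UNDEF = :undef # hypothetical stand-in for Qundef

# A slot models a VALUE field that holds a reference to another object.
Slot = Struct.new(:value)

class ToyMarker
  def initialize
    @marked = {}.compare_by_identity # objects reached through strong marks
    @weak_buffer = []                # slots registered in this marking phase
  end

  # Models rb_gc_mark: the object is strongly reachable.
  def mark(obj)
    @marked[obj] = true
  end

  # Models rb_gc_mark_weak: remember the slot without keeping its target alive.
  def mark_weak(slot)
    @weak_buffer << slot
  end

  # Runs once marking is done: clear each weak slot whose target was never
  # strongly marked. Returns [registered, retained].
  def finish_marking
    registered = @weak_buffer.size
    retained = 0
    @weak_buffer.each do |slot|
      if @marked.key?(slot.value)
        retained += 1
      else
        slot.value = UNDEF
      end
    end
    @weak_buffer.clear
    [registered, retained]
  end
end

def run_toy_marking
  live = Object.new
  dead = Object.new
  live_slot = Slot.new(live)
  dead_slot = Slot.new(dead)

  marker = ToyMarker.new
  marker.mark(live)           # only `live` is strongly marked
  marker.mark_weak(live_slot)
  marker.mark_weak(dead_slot)
  registered, retained = marker.finish_marking

  {
    live_kept: live_slot.value.equal?(live),
    dead_cleared: dead_slot.value == UNDEF,
    registered: registered,
    retained: retained,
  }
end

p run_toy_marking
```

In this model the slot pointing at the strongly marked object is left alone, while the slot pointing at the unmarked object is overwritten with the `UNDEF` placeholder.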
Weak references are implemented on the callable method entry (CME) of
callcaches, which fixes issue #19436.
Weak references are also implemented on `ObjectSpace::WeakMap` and
`ObjectSpace::WeakKeyMap`, which have:
- Significantly simpler implementations, because we no longer need multiple
tables or finalizers on the objects.
- Support for compaction, because finalizers pin objects and we no longer
define finalizers on the objects.
- Much faster performance (see [benchmarks](#microbenchmarks)).
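For readers less familiar with `ObjectSpace::WeakMap`, here is a minimal sketch of the behavior that stays the same at the Ruby level. The helper name `weak_map_demo` is made up for this example. The sketch only asserts on an entry whose key and value are still strongly referenced, since the moment an unreferenced entry disappears is up to the collector.

```ruby
# Hypothetical helper showing WeakMap behavior for a strongly referenced pair.
def weak_map_demo
  wmap = ObjectSpace::WeakMap.new
  key = Object.new
  val = Object.new
  wmap[key] = val

  # Nothing else references this pair, so the entry may be dropped by a later
  # GC cycle; exactly when is not something Ruby code can rely on.
  wmap[Object.new] = Object.new

  GC.start

  {
    found: wmap[key].equal?(val), # key and val are still alive here
    has_key: wmap.key?(key),
    missing: wmap[Object.new],    # lookup of a key that was never inserted
  }
end

p weak_map_demo
```

The lookup of a key that was never inserted returns `nil`, which is the same path the microbenchmark exercises with `wmap[val]`.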
## Metrics
This patch also adds two metrics, `GC.latest_gc_info(:weak_references_count)`
and `GC.latest_gc_info(:retained_weak_references_count)`. These two metrics
return the number of weak references registered and the number of weak
references retained (references that did not point to a dead object) in the
last GC cycle.
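A minimal sketch of reading the two counters, assuming a Ruby build that includes this patch. The helper name `weak_reference_metrics` is hypothetical. It reads from the full `GC.latest_gc_info` hash so that a build without these keys yields `nil` instead of raising.

```ruby
# Hypothetical helper: reads the two counters from the most recent GC cycle.
# A counter is nil on Ruby builds that do not report that key.
def weak_reference_metrics
  GC.start # make sure at least one full GC cycle has run
  info = GC.latest_gc_info
  {
    registered: info[:weak_references_count],
    retained: info[:retained_weak_references_count],
  }
end

p weak_reference_metrics
```

The difference between the two values is the number of weak references that were cleared because they pointed to dead objects in that cycle.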
## Benchmark results
### YJIT-bench
We see largely no change in performance or memory usage after this feature.
```
--------------  ---------  ----------  ---------  -----------  ----------  ---------  --------------  -----------
bench           base (ms)  stddev (%)  RSS (MiB)  branch (ms)  stddev (%)  RSS (MiB)  branch 1st itr  base/branch
activerecord    72.3       2.2         51.9       72.9         2.2         51.9       0.99            0.99
chunky-png      889.2      0.3         43.9       874.5        0.3         42.5       1.02            1.02
erubi-rails     21.2       13.5        90.7       21.0         13.3        90.9       1.01            1.01
hexapdf         2557.0     0.8         157.1      2559.2       0.7         197.1      1.01            1.00
liquid-c        65.2       0.4         34.5       65.4         0.4         34.5       0.99            1.00
liquid-compile  62.5       0.4         30.9       62.2         0.4         31.0       1.00            1.01
liquid-render   164.6      0.4         33.1       162.6        0.3         33.1       1.01            1.01
mail            133.3      0.1         46.4       134.4        0.2         46.4       1.03            0.99
psych-load      2066.6     0.2         31.6       2083.6       0.1         31.6       0.99            0.99
railsbench      2027.0     0.5         88.8       2019.4       0.5         89.0       1.01            1.00
ruby-lsp        65.6       3.0         90.1       65.4         3.1         88.5       1.00            1.00
sequel          73.1       1.1         36.6       73.1         1.1         36.6       1.00            1.00
--------------  ---------  ----------  ---------  -----------  ----------  ---------  --------------  -----------
```
### Microbenchmarks
We can see significantly improved performance in `ObjectSpace::WeakMap`, with
`ObjectSpace::WeakMap#[]=` being nearly 3x faster.
Base:
```
ObjectSpace::WeakMap#[]=
1.037M (± 0.5%) i/s - 5.262M in 5.072833s
ObjectSpace::WeakMap#[]
12.367M (± 0.9%) i/s - 62.479M in 5.052365s
```
Branch:
```
ObjectSpace::WeakMap#[]=
3.054M (± 0.3%) i/s - 15.448M in 5.058783s
ObjectSpace::WeakMap#[]
15.796M (± 4.8%) i/s - 79.245M in 5.028583s
```
Code:
```ruby
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

wmap = ObjectSpace::WeakMap.new
key = Object.new
val = Object.new
wmap[key] = val

Benchmark.ips do |x|
  x.report("ObjectSpace::WeakMap#[]=") do |times|
    i = 0
    while i < times
      wmap[Object.new] = Object.new
      i += 1
    end
  end

  x.report("ObjectSpace::WeakMap#[]") do |times|
    i = 0
    while i < times
      wmap[key]
      wmap[val] # does not exist
      i += 1
    end
  end
end
```