Project

General

Profile

Actions

Feature #19729

closed

Store object age in a bitmap

Added by eightbitraptor (Matt V-H) over 1 year ago. Updated over 1 year ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:113898]

Description

Github PR

Store object age in a bitmap #7938

Abstract

Ruby currently uses 2 bits of the flags in every object to track how many GC
events the object has survived. Objects are created at age 0 and can currently
grow to age 3, at which point they are considered "old".

Similar to the work carried out for bitmap marking in Ruby
2.0
which moved the mark bit from the
flags to a bitmap attached to a heap page; This PR moves the age bits out of the
object flags and stores them in a bitmap.

Description

This PR creates a new bitmap age_bits on each heap page. the size of the
bitmap is controlled by HEAP_PAGE_BITMAP_LIMIT, which roughly indicates how
many objects need to be considered by that bitmap, and RVALUE_AGE_BITS_SIZE
which varies how many bits we use per object to store the age.

We also introduce functions RVALUE_AGE_SET, RVALUE_AGE_GET,
RVALUE_AGE_RESET and RVALUE_AGE_INC to manipulate the age of an object
pointed to by a VALUE.

Impact

Benefits:

  • Improved CoW performance, GC will modify the object header fewer times for
    fewer objects.
  • Allow configuration of the age at which objects are considered old. Because
    the number of bits used is configurable, we can support arbitrary numbers of
    GC events before an object is considered old. We can use this to change
    major/minor GC timings for workloads with unusually high/low object churn.
  • Free a flag in each object that can be repurposed. Object flags are a precious
    resource, we should prefer to store data outside the flags where possible.
  • Remove GC related concerns from the object structure. This is important for
    initiatives like a generic GC interface and MMTk in order to keep GC related
    code as isolated as possible.

Concerns:

  • Slightly increased RSS, because we now allocate an extra 2 bits per heap page
    slot, to create the age_bits bitmap on VM bootup. On my machine
    sizeof(struct heap_page) has increased from 1312 to 1728 bytes.

Benchmarking

Railsbench and Optcarrot benchmarked using yjit-bench. Showing small memory
increase, but comparable performance.

master: ruby 3.3.0dev (2023-06-08T11:22:43Z master 3fe09eba9d) [x86_64-linux]
mvh-rvalue-age-bitmap: ruby 3.3.0dev (2023-06-13T06:59:40Z master c74f42a4fb) [x86_64-linux]

----------  -----------  ----------  ---------  --------------------------  ----------  ---------  -----------------------------  ----------------------------
bench       master (ms)  stddev (%)  RSS (MiB)  mvh-rvalue-age-bitmap (ms)  stddev (%)  RSS (MiB)  mvh-rvalue-age-bitmap 1st itr  master/mvh-rvalue-age-bitmap
railsbench  2154.3       0.5         101.4      2124.9                      0.5         101.5      1.01                           1.01
optcarrot   5372.2       0.6         55.0       5282.3                      0.6         55.1       1.02                           1.02
----------  -----------  ----------  ---------  --------------------------  ----------  ---------  -----------------------------  ----------------------------
Legend:
- mvh-rvalue-age-bitmap 1st itr: ratio of master/mvh-rvalue-age-bitmap time for the first benchmarking iteration.
- master/mvh-rvalue-age-bitmap: ratio of master/mvh-rvalue-age-bitmap time. Higher is better for mvh-rvalue-age-bitmap. Above 1 represents a speedup.

Notes

FL_PROMOTED

We still use one of the two age bits. The original FL_PROMOTED0 has been
renamed to FL_PROMOTED and has been repurposed to indicate an objects
old/young status.

We need this because correctly tracking references from old to young objects
relies on a write barrier, that's triggered whenever a field on an object is
written to. Because this code is a very hot path it needs to be fast. Looking up
the heap page and then calculating the old/young status based on the age bits
would slow down this part of the code too much.

Instead we set FL_PROMOTED whenever the object crosses the threshold into the
old gen, and unset it if that object ever gets demoted back into the young gen.
This way the write barrier can quickly tell whether the object is old or not and
whether to add it to the rememberset.

Actions

Also available in: Atom PDF

Like1
Like0Like0Like1Like0