Feature #21353

closed

Add shape_id to RBasic under 32 bit

Added by byroot (Jean Boussier) 11 days ago. Updated 5 days ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:122203]

Description

Currently, on 64bit systems, the shape_id of every type is stored inside the RBasic.flags field, and is 32 bits
long.

However, on 32bit systems like i686 and WASM, it is much more complicated.
For T_OBJECT, T_CLASS and T_MODULE, the shape_id is stored as part of the "user flags" in FL_USER4-19,
and for all other types it's stored alongside the instance variables in the generic_fields_tbl, which means
a hash lookup is required to access it.
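As a rough illustration of the two inline encodings described above, here is a minimal sketch. All `sketch_` names are hypothetical and are not Ruby's actual macros; it assumes the shape_id sits in the upper half of the flags word in both cases (upper 32 bits of a 64-bit word, upper 16 bits of a 32-bit word, which is where FL_USER4-19 are assumed to live).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of RBasic.flags, not Ruby's real definitions. */

/* 64-bit target: assume the 32-bit shape_id lives in the upper half of flags. */
static inline uint32_t sketch_shape_id_64(uint64_t flags)
{
    return (uint32_t)(flags >> 32);
}

static inline uint64_t sketch_set_shape_id_64(uint64_t flags, uint32_t shape_id)
{
    /* Keep the lower 32 flag bits, overwrite the upper 32 bits. */
    return (flags & UINT64_C(0xFFFFFFFF)) | ((uint64_t)shape_id << 32);
}

/* 32-bit target: only 16 bits are available (assumed to start at bit 16). */
#define SKETCH_USER_SHIFT 16

static inline uint32_t sketch_shape_id_32(uint32_t flags)
{
    return flags >> SKETCH_USER_SHIFT;
}

static inline uint32_t sketch_set_shape_id_32(uint32_t flags, uint32_t shape_id)
{
    assert(shape_id <= UINT32_C(0xFFFF)); /* must fit in 16 bits */
    return (flags & UINT32_C(0xFFFF)) | (shape_id << SKETCH_USER_SHIFT);
}
```

The 32-bit variant only applies to the types that keep their shape_id in the flags; for the other types, the sketch has no equivalent, since the value has to be fetched from a separate table.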

This situation makes a lot of routines noticeably more complicated, with numerous code paths taken only by 32bit systems,
because to avoid doing two hash lookups per ivar access, the code needs a lot of contortions.
You can search for SHAPE_IN_BASIC_FLAGS to get an idea of the added complexity.

In addition, it forces us to duplicate some bits of information. For instance, RUBY_FL_FREEZE is redundant with the
shape_id. The shape already records that the object is frozen, and the only reason RUBY_FL_FREEZE hasn't been
eliminated is that on 32bit systems the shape_id isn't stored inline for some objects.

Similarly, RUBY_FL_EXIVAR is redundant with the shape_id, because to know whether an object has ivars, you can
simply check if shape_id == 0.

Reclaiming these bits would be very useful for Ractors, as we'd need two bits in objects to be able to implement
lightweight locks.

Yet another complication is that on 32bit systems, the shape_id is only 16 bits long.
I plan to use the upper bits of the shape_id to store metadata, such as the frozen and too_complex
status, which would allow testing for them without chasing a pointer: https://github.com/ruby/ruby/pull/13289.
But this currently can't be done on 32bit systems, both because accessing the shape_id might require a hash lookup,
and because it's only 16 bits long, so every single bit used for tagging severely restricts the maximum number of
shapes.
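To make the tagging idea and its cost concrete, here is a minimal sketch. The tag positions and all `sketch_` names are assumptions for illustration only and do not reflect the layout used in the linked pull request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag layout: the two highest bits of a 32-bit shape_id. */
#define SKETCH_FL_FROZEN      (UINT32_C(1) << 31)
#define SKETCH_FL_TOO_COMPLEX (UINT32_C(1) << 30)
#define SKETCH_TAG_MASK       (SKETCH_FL_FROZEN | SKETCH_FL_TOO_COMPLEX)

/* Both checks are a single mask test on the id, with no pointer to chase. */
static inline bool sketch_frozen_p(uint32_t shape_id)
{
    return (shape_id & SKETCH_FL_FROZEN) != 0;
}

static inline bool sketch_too_complex_p(uint32_t shape_id)
{
    return (shape_id & SKETCH_FL_TOO_COMPLEX) != 0;
}

/* The remaining bits identify the shape itself. */
static inline uint32_t sketch_shape_index(uint32_t shape_id)
{
    return shape_id & ~SKETCH_TAG_MASK;
}

/* Number of distinct shapes left once tag_bits are reserved in an id
 * that is id_bits wide. */
static inline uint64_t sketch_max_shapes(unsigned id_bits, unsigned tag_bits)
{
    assert(id_bits <= 32 && tag_bits <= id_bits);
    return UINT64_C(1) << (id_bits - tag_bits);
}
```

Under these assumptions, reserving two tag bits leaves 2^30 shapes with a 32-bit id but only 2^14 (16384) with a 16-bit id, which is the restriction mentioned above.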

Proposal

To simplify all this, we propose that on 32bit systems, we add a VALUE shape_id in RBasic:

struct RBasic {
    VALUE flags;
    const VALUE klass;
#if RBASIC_SHAPE_ID_FIELD
    VALUE shape_id;
#endif
};

This ensures that on 32bit systems, all objects have their shape_id at the same predictable offset, and that it is 32 bits long.

As you can see in the pull request, it simplifies the code quite significantly: https://github.com/ruby/ruby/pull/13341,
and there's more cleanup that can be done.

The downside obviously is that on 32bit, objects would grow from 20B to 24B.
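The offset and size figures can be checked with a small model of the layout. This is only a sketch: `uint32_t` stands in for VALUE on a 32-bit target, the `sketch_` types are hypothetical, and the three-word payload is an assumption chosen to match the 20B figure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sketch_value32; /* stand-in for VALUE on a 32-bit target */

/* Header without and with the proposed field. */
struct sketch_rbasic_before {
    sketch_value32 flags;
    sketch_value32 klass;
};

struct sketch_rbasic_after {
    sketch_value32 flags;
    sketch_value32 klass;
    sketch_value32 shape_id;
};

/* Assumed slot: the header followed by three words of payload. */
struct sketch_slot_before {
    struct sketch_rbasic_before basic;
    sketch_value32 payload[3];
};

struct sketch_slot_after {
    struct sketch_rbasic_after basic;
    sketch_value32 payload[3];
};

_Static_assert(sizeof(struct sketch_slot_before) == 20, "20B before");
_Static_assert(sizeof(struct sketch_slot_after) == 24, "24B after");
_Static_assert(offsetof(struct sketch_rbasic_after, shape_id) == 8,
               "shape_id sits right after flags and klass");

/* With the field in the header, one load at a fixed offset works for
 * every object type, with no per-type branching or table lookup. */
static inline uint32_t sketch_shape_id_of(const struct sketch_rbasic_after *obj)
{
    return obj->shape_id;
}
```

The trade-off is visible directly in the assertions: one extra word per object buys a uniform, full-width shape_id.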

Pull Request

You can find the proposed patch at: https://github.com/ruby/ruby/pull/13341

cc @tenderlovemaking (Aaron Patterson) and @jhawthorn (John Hawthorn)

Also FYI @katei (Yuta Saito) because, as the maintainer of WASM, I assume this impacts you the most.
