Feature #21722: Expose rb_gc_mark_weak API for use in extensions

Added by ivoanjo (Ivo Anjo) about 4 hours ago. Updated 12 minutes ago.

Status: Open
Assignee: -
Target version: -
[ruby-core:123961]

Description

In https://bugs.ruby-lang.org/issues/21710 it came up that

  1. On top of _id2ref being deprecated in Ruby 4.0, it's a bad idea to use object_id from the NEWOBJ tracepoint

  2. rb_gc_mark_weak, which would be the alternative for an extension that needs weak reference-like behavior, is not available to extensions

So I've opened this ticket to request exposing rb_gc_mark_weak so it can be used by extensions.


The Datadog Ruby profiler is currently using object_id and _id2ref to implement its "heap profiling" -- that is, we have a NEWOBJ tracepoint, and from time to time (i.e. not for every object), we select an object and track its lifetime by keeping its id and periodically checking if it's still alive.
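For readers unfamiliar with the id-based approach, here is a minimal Ruby-level sketch of the idea. It is an illustration only, not the profiler's actual code (which runs in C from a NEWOBJ tracepoint): the class name `SampledObjectTracker` and its methods are hypothetical, and it relies on `ObjectSpace._id2ref` raising `RangeError` once an id no longer maps to a live object. Note that `_id2ref` is the API being deprecated in Ruby 4.0, so this may emit a deprecation warning there.

```ruby
# Hypothetical sketch: track sampled objects by id only, so the tracker
# itself never holds a strong reference that would keep them alive.
class SampledObjectTracker
  def initialize
    @tracked_ids = []
  end

  # Remember just the object's id (an Integer), not the object.
  def track(obj)
    @tracked_ids << obj.object_id
    nil
  end

  # _id2ref raises RangeError when the id does not map to a live object.
  def alive?(id)
    ObjectSpace._id2ref(id)
    true
  rescue RangeError
    false
  end

  # Drop ids whose objects are gone; return how many are still tracked.
  def prune
    @tracked_ids.select! { |id| alive?(id) }
    @tracked_ids.size
  end
end
```

A caller would invoke `track` for the occasional sampled object and `prune` from time to time; an object that is still strongly referenced elsewhere is reported as alive.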

We're using this approach instead of:

  • The FREEOBJ event => Reduced overhead, as we don't need to be called for every object, and we don't need to deal with corner cases where FREEOBJ may not be called for an object

  • WeakMap => WeakMap APIs are Ruby-level and need the GVL, which makes them hard to use from low-level tracepoints and prevents us from reducing overhead by doing profiler work with the GVL released.

For our purposes, it would be OK if this API is not "official" -- e.g. if it's one of those that gets exposed as a public symbol but not documented and no promises made for future Ruby releases.

Updated by peterzhu2118 (Peter Zhu) 25 minutes ago #1 [ruby-core:123964]

Hi, author of rb_gc_mark_weak here.

I think it would be good to have such an API available. However, I don't think the current API is it, because it is very tricky to use with incremental marking. Incremental marking splits marking into several steps interleaved with Ruby code execution, so the state of an object can change after it has been marked. And since rb_gc_mark_weak operates on pointers, the memory those pointers point into may have been freed or reallocated by then. This is why rb_gc_remove_weak exists, and also why the ST tables in WeakMap/WeakKeyMap have keys and values that point to malloc memory containing the actual keys and values.

I have proposed #21084 (implemented in this PR), which I think may be an easier-to-use API.

Updated by ivoanjo (Ivo Anjo) 12 minutes ago #2 [ruby-core:123965]

Thanks for the hint, Peter! I had not spotted https://bugs.ruby-lang.org/issues/21084 :)

Based on your notes, it looks like exposing rb_gc_declare_weak_references is a much better choice 👍
