Project

General

Profile

Actions

Feature #21852

open

New improved allocator function interface

Feature #21852: New improved allocator function interface

Added by byroot (Jean Boussier) 2 months ago. Updated 6 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:124634]

Description

When implementing native types with the TypedData API, You have to define an allocator function.
That function receive the class to allocate and is supposed to return a new instance.

/**
 * This is  the type of  functions that ruby calls  when trying to  allocate an
 * object.  It is  sometimes necessary to allocate extra memory  regions for an
 * object.  When you define a class that uses ::RTypedData, it is typically the
 * case.  On  such situations  define a function  of this type  and pass  it to
 * rb_define_alloc_func().
 *
 * @param[in]  klass  The class that this function is registered.
 * @return     A newly allocated instance of `klass`.
 */
typedef VALUE (*rb_alloc_func_t)(VALUE klass);

Current API shortcomings

There are a few limitations with the current API.

Hard to disallow .allocate without breaking #dup and #clone.

First, it is frequent for extensions to want to disable Class#allocate for their native types via rb_undef_alloc_func, as very often allowing uninitialized object would lead to bugs.

The problem with rb_undef_alloc_func is that the alloc func is also used internally by dup and clone, so most types that undefine the allocator also prevent object copy without necessarily realizing it.

If you want to both disable Class#allocate yet still allow copying, you need to entirely implement the #dup and #clone methods, which is non-trivial and very few types do. One notable exception is Binding, which has to implement these two methods: https://github.com/ruby/ruby/blob/bea48adbcacc29cce9536977e15ceba0d65c8a02/proc.c#L301-L326

This works for Ruby code, however it doesn't work with C-level rb_obj_dup(VALUE), as used by the Ractor logic to copy objects across ractors.
In the case of Binding we probably wouldn't allow it anyway, but for other types it may be a problem.

Can't support objects of variable width

When duping or cloning an object of variable width, you need access to the original object to be able to allocate the right slot size.

An example of that is Thread::Backtrace objects, as evidenced by [Bug #21818].

To support sending exception objects across ractors, we'd need to make rb_obj_dup() work for Thread::Backtrace, but to correctly duplicate a backtrace, the allocator needs to know the size.

Proposed new API

I'd like to propose a new API for defining allocators:

typedef VALUE (*rb_copy_alloc_func_t)(VALUE klass, VALUE other);

In addition to the class to allocate, the function also receives the instance to copy.
When called by Class#allocate, the other argument is set to Qundef. Example usage:

static VALUE
backtrace_alloc(VALUE klass, VALUE other)
{
    rb_backtrace_t *bt;
    if (UNDEF_P(other)) {
        // Regular alloc
        return TypedData_Make_Struct(klass, rb_backtrace_t, &backtrace_data_type, bt);
    }
    else {
        // Copy
        rb_backtrace_t *other_bt;
        TypedData_Get_Struct(other, rb_backtrace_t, &backtrace_data_type, other_bt);
        VALUE self = backtrace_alloc_capa(other_bt->backtrace_size, &bt);
        bt->backtrace_size = other_bt->backtrace_size;
        MEMCPY(bt->backtrace, other_bt->backtrace, rb_backtrace_location_t, other_bt->backtrace_size);
        return self;
    }
}

Backward compatibility

Older-style allocator can keep being supported as long as we wish.

The one backward potential compatibility concern is third party code that calls rb_alloc_func_t rb_get_alloc_func(VALUE klass);.
As its documentation suggest, there's not much valid use case for it, but regardless we can keep supporting it by returning
a "jump function". See copy_allocator_adapter: https://github.com/ruby/ruby/pull/15795/changes#diff-884a5a8a369ef1b4c7597e00aa65974cec8c5f54f25f03ad5d24848f64892869R1640-R1653

Opportunity for more changes?

I was discussing this new interface with @ko1 (Koichi Sasada) and it appears that the current allocator interface may also be a limitation for Ractors and Ractor local GC. i.e. it might be useful to let the allocator function know that we're copying from one Ractor to another.

But I know to little about Ractor local GC to make a proposition here, so I will let @ko1 (Koichi Sasada) make suggestions.

Implementation

I implemented this idea in https://github.com/ruby/ruby/pull/15795, to solve [Bug #21818].
It could remain a purely private API, but I think it would make sense to expose it.


Related issues 3 (1 open2 closed)

Related to Ruby - Bug #21818: Thread backtraces cannot be communicated over Ractor portsClosedractorActions
Related to Ruby - Feature #21963: A solution to completely avoid allocated-but-uninitialized objectsOpenActions
Related to Ruby - Bug #21267: respond_to check in Class#allocate is inconsistentClosedActions

Updated by Eregon (Benoit Daloze) 2 months ago Actions #1 [ruby-core:124649]

Are initialize_copy/initialize_dup/initialize_clone still called when using a copy_allocator?
In the backtrace_alloc example you seem to already do some copying in that function, which then makes it unclear where the copying should be done.

First, it is frequent for extensions to want to disable Class#allocate for their native types via rb_undef_alloc_func, as very often allowing uninitialized object would lead to bugs.

How about using rb_undef_method(klass, "allocate"); for such cases?
I think that's a simple solution and requires no changes.

Fundamentally there is a difference between an allocate method and an alloc function, new/dup/clone all require an alloc function, but they should not require (and they don't IIRC) an allocate method.

Updated by byroot (Jean Boussier) 2 months ago Actions #2 [ruby-core:124650]

Are initialize_copy/initialize_dup/initialize_clone still called when using a copy_allocator?

Yes, it is unchanged.

In the backtrace_alloc example you seem to already do some copying in that function, which then makes it unclear where the copying should be done.

Indeed. Technically it wouldn't be required, but I think it's more reliable to do it there than in initialize_copy as the later could e redefined and cause corruption.

How about using rb_undef_method(klass, "allocate"); for such cases?

It's a corner case, but that allows redefining it later on.

Updated by Eregon (Benoit Daloze) 2 months ago Actions #3 [ruby-core:124651]

byroot (Jean Boussier) wrote in #note-2:

Indeed. Technically it wouldn't be required, but I think it's more reliable to do it there than in initialize_copy as the later could be redefined and cause corruption.

That could cause leaks if copying state involves extra allocations though, as a previously-existing initialize_copy might allocate and just set the pointers, but not free that first copy done in the copy_allocator.
It makes the contract unclear about what is supposed to copy what.

I think we need to trust initialize_copy, or invent a new Ruby-level protocol for copying objects.

Inventing a new Ruby-level protocol for copying objects and for creation without uninitialized state would be great.
Some core classes already do this but then they typically don't support dup/clone.
It'd be great to have this in general, so one could write classes that never have to care about uninitialized state since there are no instances in that uninitialized state ever.
We'd have some method to do both allocation + initialization at once, and another method to create a copy + initialize it as one call.

How about using rb_undef_method(klass, "allocate"); for such cases?

It's a corner case, but that allows redefining it later on.

I don't think we need to worry about this corner case.
Such things are clearly violating internals of the class, and then they might as well rb_define_alloc_func() and break it too.
But I suppose for core classes there might be a point to try to not segfault in that case, mmh.

Maybe classes should have a flag for "allow/disallow Class#allocate" and then Class#allocate would check that?
There is already a check for singleton classes, so we could merge it with that check for free and just have singleton classes always set that flag to false.

Updated by byroot (Jean Boussier) 2 months ago Actions #4 [ruby-core:124652]

Inventing a new Ruby-level protocol for copying objects and for creation without uninitialized state would be great.

Yes, this is somewhat what this new allocator API does. It also solves the problem that the Ractor API must be able to clone objects but can hardly trust user defined initialize_copy methods.

Updated by Eregon (Benoit Daloze) 2 months ago Actions #5 [ruby-core:124653]

It only does it for classes defined in C, and if they do all state copying in copy_allocator.
I think this would be valuable to have for any class, i.e. also for classes defined in Ruby and not in C.

Updated by Eregon (Benoit Daloze) 18 days ago · Edited Actions #6 [ruby-core:125004]

Thinking more about this I think there should be a protocol or an easy way to avoid the allocated-but-uninitialized state completely, which is problematic for classes defined in C but also in Ruby (though Ruby-defined classes will typically raise instead of segfault in such cases, but still the error can be cryptic and it's bad to have to deal with potentially-uninitialized objects).

I think we would need only two things:

  • A method to allocate+initialize at once. This must be defined as a method of Class.
    • There is already Class#new, so overriding that seems the simplest.
  • A method to allocate+initialize_copy at once.
    • This could be either an instance method, or a class taking an existing object as argument.
    • Overriding Kernel#dup seems the easiest here, and maybe define Kernel#clone as dup + extra clone logic to simplify.

In code:

module Kernel
  def clone
    copy = dup
    # extra clone logic like freezing
  end
end

# Usage
class MyClassWithoutUninitializedState
  ALLOCATE = method(:allocate)
  private_const :ALLOCATE

  def self.new(*args)
    ALLOCATE.call.tap { it.send(:initialize) }
  end

  def dup
    ALLOCATE.call.tap { |copy| copy.send(:initialize_copy, self) }
  end

  # Could either:
  undef_method :allocate # but that still allow using e.g. Class.instance_method(:allocate).bind_call(MyClassWithoutUninitializedState)

  # Or
  remove_alloc_function # Same as rb_undef_alloc_func(), this would be a new method of Class
  # but that probably would break `ALLOCATE.call` above
end

And similar for classes defined in C.
In C it's easy to call the alloc function directly even when not registered as such on the Class.

The tricky part of classes defined in Ruby is how to call the alloc function in overridden new/dup and still not expose it to users.
So I think a better solution would be a new Class method that defines new and dup and clone while taking a snapshot of the current alloc_func and then removes the alloc_func, then that would work reliably.

Say Class#safe_initialization which does something equivalent to (but more performant):

class Class
  def safe_initialization
    alloc_function = rb_get_alloc_func(self)

    define_singleton_method(:new) do |...|
      alloc_function.call.tap { it.send(:initialize) }
    end

    define_method(:dup) do
      alloc_function.call.tap { |copy| copy.send(:initialize_dup, self) }
    end

    define_method(:clone) do
      alloc_function.call.tap { |copy| copy.send(:initialize_clone, self) }
      # extra clone logic
    end
    
    rb_undef_alloc_func(self)
    return self
  end
end

# Usage
class MyClassWithoutUninitializedState
  safe_initialization
end

# .new, .dup, .clone all work fine

MyClassWithoutUninitializedState.allocate # => exception
Class.instance_method(:allocate).bind_call(MyClassWithoutUninitializedState) # => exception

And that would work for both classes defined in Ruby and C.

The only methods which would see uninitialized instances would be initialize, initialize_copy, initialize_dup, initialize_clone, which seems reasonable and also unavoidable anyway
(I don't think anyone wants to e.g. write def self.new = ALLOCATE.call.tap { it.instance_variable_set(:@foo, 42); ... } instead of def initialize; @foo = 42; ...; end).

Updated by Eregon (Benoit Daloze) 18 days ago Actions #7 [ruby-core:125005]

Going further, Class#safe_initialization instead of redefining these 3 methods could just set a new internal_alloc_func field in RClass (only used by new/dup/clone and can never be read by anything else) + rb_undef_alloc_func().
And rb_define_alloc_func() (as long as the current alloc_func is not NULL, i.e. hasn't been removed) could set that field too so that new/dup/clone only need to read that field and not the "public alloc_func" field.

Updated by Eregon (Benoit Daloze) 18 days ago · Edited Actions #8 [ruby-core:125006]

I'm aware what I propose doesn't solve Can't support objects of variable width, i.e. the case for Thread::Backtrace.
But AFAIK variable width is only available for core classes (not as public API), and Thread::Backtrace is core too so could just define dup & clone to allocate the right size and use some internal functions to not have to reimplement the rest of dup/clone.
Even with #21853 the alloc size would be fixed (per class) so it wouldn't need rb_copy_alloc_func_t.

One clear downside of rb_copy_alloc_func_t is performance as it introduces an extra check for new/dup/clone to check whether to use a normal or copy alloc_func.

BTW Thread::Backtrace is AFAIK never exposed to users, so there might be more direct ways to deal with that.

Updated by Eregon (Benoit Daloze) 18 days ago Actions #9

  • Related to Bug #21818: Thread backtraces cannot be communicated over Ractor ports added

Updated by akr (Akira Tanaka) 13 days ago Actions #10 [ruby-core:125081]

The separation of allocation and initialization is essential to marshal cyclically referenced objects.

We can use Marshal to dump and load cyclically referenced objects:

x = []
y = [x]
x << y
z = Marshal.load(Marshal.dump(x))
p z #=> [[[...]]]

Marshal.load generates objects using allocate method and initialize them later as (pseudo Ruby code).

x = Array.allocate
y = Array.allocate
x << y
y << x

This needs that objects can be allocated without any information except the class.
So, it is not a good idea to blur the separation between allocation and initialization.
But the proposal merges them.

It is interesting that variable width allocations and Marshal.
I'm not sure that we can support variable width objects with marshal if such objects needs the width for allocation.

I guess byroot try to solve a problem with Ractor.
However, the problem is not explained well.
So, it's difficult to sympathize with the necessity.

Updated by byroot (Jean Boussier) 13 days ago Actions #11 [ruby-core:125082]

This needs that objects can be allocated without any information except the class.

This is unchanged.

But the proposal merges them.

Not really? There is a single function yes, but the function has to check for UNDEF_P(other) in all cases.
It's a single function because we have to store the function pointer on the class, so storing two pointers is complicated (we have no space spaces in RClass).

However, the problem is not explained well.

Please let me know what you don't understand and I will do my best to clarify it.

the crux of the problem is that Objects now have variable width, so when duping or cloning an object, the allocator need to be able to inspect the source object to know how large of a slot to allocate.

The linked PR does copy right after allocation, but this is in no way a requirement.

Updated by Eregon (Benoit Daloze) 10 days ago Actions #12 [ruby-core:125102]

From the dev meeting log
it seems other agreed there should be no copying done in backtrace_alloc, but instead done in initialize_copy.

For Thread::Backtrace the obvious fix seems to just define dup and clone, and that can reuse internal helper functions to avoid duplication, as said in https://bugs.ruby-lang.org/issues/21852#note-8.

I suppose the main motivation for this issue is to copy Embedded TypedData (#21853)?
If the Embedded TypedData has a fixed size (uses TypedData_Make_Struct()) then it should be no big issue because the size is fixed per rb_data_type_t.
But the docs also mention rb_data_typed_object_zalloc() and I guess that's the problem?
Could the .dsize member be used there perhaps?

Maybe another way to go is exposing in the C-API functions that do the same as dup & clone, but work on an already-allocated object.
Then the C extension could easily define dup & clone and just use the correct way to allocate, and it would even work with rb_undef_alloc_func().

Updated by Eregon (Benoit Daloze) 10 days ago Actions #13 [ruby-core:125103]

akr (Akira Tanaka) wrote in #note-10:

The separation of allocation and initialization is essential to marshal cyclically referenced objects.

Thank you for that, I didn't think about it.
That would then be only a small tweak for my idea in https://bugs.ruby-lang.org/issues/21852#note-7, where Marshal would also use internal_alloc_func.
I'll extract that idea to a separate proposal, it's mostly orthogonal to this issue anyway.

Updated by byroot (Jean Boussier) 10 days ago Actions #14 [ruby-core:125104]

I feel like I'm repeating the same thing in a loop, so I guess I'm bad at explaining? But here it goes:

it seems other agreed there should be no copying done in backtrace_alloc, but instead done in initialize_copy.

Already answered twice, I did it in my example PR but it's no obligation at all, perhaps I shouldn't have done it, but please forget about it. This is in no way the reason for the requested API.

For Thread::Backtrace the obvious fix seems to just define dup and clone, and that can reuse internal helper functions to avoid duplication, as said in https://bugs.ruby-lang.org/issues/21852#note-8.

Pretty sure I repeated that one many times too.

Ractors don't call #clone/#dup on objects (they can't trust it). It calls rb_obj_clone -> mutable_obj_clone:

static VALUE
mutable_obj_clone(VALUE obj, VALUE kwfreeze)
{
    VALUE clone = rb_obj_alloc(rb_obj_class(obj));
    return rb_obj_clone_setup(obj, clone, kwfreeze);
}

I suppose the main motivation for this issue is to copy Embedded TypedData (#21853)?

Yes, and types of dynamic slot size in general.

If the Embedded TypedData has a fixed size (uses TypedData_Make_Struct()) t

Yes, the happy path is handled fine by the current API. But even with TypedData_Make_Struct you can use the same rb_data_type_t with different struct hence different sizes. It's not very common, but I believe I've seen it in the past.

Could the .dsize member be used there perhaps?

No. dsize is meant to report all allocated memory, not just the main struct passed to TypedData_Make_Struct.
This also assume that clone/dup do want to allocate the exact same size as the source object. You could imagine types where this is the opportunity to recompact the object, similar to how Array#dup and other core types behave.

Then the C extension could easily define dup & clone and just use the correct way to allocate

This would require to free space in RClass for more pointer functions. We're out of space right now.
It's also a much larger API for not much gain IMO.

Updated by Eregon (Benoit Daloze) 9 days ago Actions #15

  • Related to Feature #21963: A solution to completely avoid allocated-but-uninitialized objects added

Updated by Eregon (Benoit Daloze) 8 days ago Actions #16

  • Related to Bug #21267: respond_to check in Class#allocate is inconsistent added

Updated by Eregon (Benoit Daloze) 6 days ago Actions #17 [ruby-core:125146]

I had a call with @byroot (Jean Boussier) and I understand better the proposal now.

I think the proposal makes sense, but with the small tweaks proposed in https://bugs.ruby-lang.org/issues/21963#note-8 it would completely avoid allocated-but-uninitialized objects, which I think would be game changer for classes defined in C (never need to worry about objects allocate-d but not initialize-d, and no need to check for it in every instance method).

Actions

Also available in: PDF Atom