Feature #21852
New improved allocator function interface
Description
When implementing native types with the TypedData API, you have to define an allocator function.
That function receives the class to allocate and is supposed to return a new instance.
/**
 * This is the type of functions that ruby calls when trying to allocate an
 * object. It is sometimes necessary to allocate extra memory regions for an
 * object. When you define a class that uses ::RTypedData, it is typically the
 * case. On such situations define a function of this type and pass it to
 * rb_define_alloc_func().
 *
 * @param[in] klass The class that this function is registered.
 * @return A newly allocated instance of `klass`.
 */
typedef VALUE (*rb_alloc_func_t)(VALUE klass);
Current API shortcomings
There are a few limitations with the current API.
Hard to disallow .allocate without breaking #dup and #clone.
First, extensions frequently want to disable Class#allocate for their native types via rb_undef_alloc_func, because allowing uninitialized objects would very often lead to bugs.
The problem with rb_undef_alloc_func is that the alloc func is also used internally by dup and clone, so most types that undefine the allocator also prevent their objects from being copied, without necessarily realizing it.
If you want to disable Class#allocate yet still allow copying, you need to implement the #dup and #clone methods entirely yourself, which is non-trivial and which very few types do. One notable exception is Binding, which has to implement these two methods: https://github.com/ruby/ruby/blob/bea48adbcacc29cce9536977e15ceba0d65c8a02/proc.c#L301-L326
This works for Ruby code, however it doesn't work with C-level rb_obj_dup(VALUE), as used by the Ractor logic to copy objects across ractors.
In the case of Binding we probably wouldn't allow it anyway, but for other types it may be a problem.
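The coupling between allocation and copying can be illustrated with a small standalone model. This is only a sketch: the `toy_*` names are hypothetical and do not exist in CRuby, and where Ruby would raise a TypeError the model simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; none of these names exist in CRuby. */
typedef struct toy_class toy_class;
typedef struct toy_object { toy_class *klass; int payload; } toy_object;
typedef toy_object *(*toy_alloc_func)(toy_class *klass);

struct toy_class { toy_alloc_func alloc; };

/* A plain allocator with the same shape as rb_alloc_func_t: class in, instance out. */
static toy_object *
toy_default_alloc(toy_class *klass)
{
    toy_object *obj = calloc(1, sizeof(*obj));
    assert(obj != NULL);
    obj->klass = klass;
    return obj;
}

/* Models Class#allocate: fails (returns NULL) when no allocator is registered. */
static toy_object *
toy_allocate(toy_class *klass)
{
    return klass->alloc ? klass->alloc(klass) : NULL;
}

/* Models #dup: it needs a blank instance first, so it goes through the same allocator. */
static toy_object *
toy_dup(const toy_object *src)
{
    toy_object *copy = toy_allocate(src->klass);
    if (copy == NULL) return NULL;
    copy->payload = src->payload;
    return copy;
}

/* Models rb_undef_alloc_func(): removes the only allocation path. */
static void
toy_undef_alloc(toy_class *klass)
{
    klass->alloc = NULL;
}
```

In this model, calling `toy_undef_alloc` makes both `toy_allocate` and `toy_dup` fail, because the copy path has no other way to obtain a blank instance.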
Can't support objects of variable width
When duping or cloning an object of variable width, you need access to the original object to be able to allocate the right slot size.
An example of that is Thread::Backtrace objects, as evidenced by [Bug #21818].
To support sending exception objects across ractors, we'd need to make rb_obj_dup() work for Thread::Backtrace, but to correctly duplicate a backtrace, the allocator needs to know the size.
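To make the sizing problem concrete, here is a minimal plain C sketch that uses a flexible array member as a stand-in for a variable-width slot. The `vw_*` names are hypothetical and this is not how Thread::Backtrace is laid out; it only shows that a class-only allocator has nothing to size the object from, while a copy-aware one does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-width object: the payload lives inline, after the header. */
typedef struct {
    size_t len;
    int items[];            /* flexible array member, sized at allocation time */
} vw_object;

/* Number of bytes an object with `len` inline items occupies. */
static size_t
vw_size_for(size_t len)
{
    return sizeof(vw_object) + len * sizeof(int);
}

/* Class-only style allocation: there is no source object to consult,
 * so it can only reserve room for zero items. */
static vw_object *
vw_alloc_blank(void)
{
    vw_object *obj = calloc(1, vw_size_for(0));
    assert(obj != NULL);
    return obj;
}

/* Copy-aware allocation: the source tells us how much room to reserve. */
static vw_object *
vw_alloc_copy(const vw_object *src)
{
    vw_object *obj = malloc(vw_size_for(src->len));
    assert(obj != NULL);
    obj->len = src->len;
    memcpy(obj->items, src->items, src->len * sizeof(int));
    return obj;
}
```

An object returned by `vw_alloc_blank` is too small to receive the items of any non-empty source, which is the situation a class-only allocator is stuck in when it is called from a copy.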
Proposed new API
I'd like to propose a new API for defining allocators:
typedef VALUE (*rb_copy_alloc_func_t)(VALUE klass, VALUE other);
In addition to the class to allocate, the function also receives the instance to copy.
When called by Class#allocate, the other argument is set to Qundef. Example usage:
static VALUE
backtrace_alloc(VALUE klass, VALUE other)
{
    rb_backtrace_t *bt;
    if (UNDEF_P(other)) {
        // Regular alloc
        return TypedData_Make_Struct(klass, rb_backtrace_t, &backtrace_data_type, bt);
    }
    else {
        // Copy
        rb_backtrace_t *other_bt;
        TypedData_Get_Struct(other, rb_backtrace_t, &backtrace_data_type, other_bt);
        VALUE self = backtrace_alloc_capa(other_bt->backtrace_size, &bt);
        bt->backtrace_size = other_bt->backtrace_size;
        MEMCPY(bt->backtrace, other_bt->backtrace, rb_backtrace_location_t, other_bt->backtrace_size);
        return self;
    }
}
Backward compatibility
Old-style allocators can remain supported for as long as we wish.
The one potential backward compatibility concern is third-party code that calls rb_alloc_func_t rb_get_alloc_func(VALUE klass);.
As its documentation suggests, there are few valid use cases for it, but regardless we can keep supporting it by returning
a "jump function". See copy_allocator_adapter: https://github.com/ruby/ruby/pull/15795/changes#diff-884a5a8a369ef1b4c7597e00aa65974cec8c5f54f25f03ad5d24848f64892869R1640-R1653
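The general shape of such a jump function can be sketched in standalone C. This is a simplified model and not the code from the pull request: `VALUE` is reduced to an integer type, `MODEL_QUNDEF` stands in for Qundef, and the `model_*` names, including the per-class record, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins so the sketch compiles without ruby.h. */
typedef uintptr_t VALUE;
#define MODEL_QUNDEF ((VALUE)0)  /* plays the role of Qundef */

typedef VALUE (*model_alloc_func_t)(VALUE klass);                   /* legacy shape */
typedef VALUE (*model_copy_alloc_func_t)(VALUE klass, VALUE other); /* proposed shape */

/* Hypothetical per-class record holding at most one of the two allocator kinds. */
typedef struct {
    model_alloc_func_t legacy_alloc;
    model_copy_alloc_func_t copy_alloc;
} model_class;

/* The "jump function": it has the legacy signature and forwards to the
 * class's copy-aware allocator with the "no source object" sentinel. */
static VALUE
model_copy_allocator_adapter(VALUE klass)
{
    model_class *cls = (model_class *)klass;
    assert(cls->copy_alloc != NULL);
    return cls->copy_alloc(klass, MODEL_QUNDEF);
}

/* Models rb_get_alloc_func(): always hands back a legacy-shaped pointer. */
static model_alloc_func_t
model_get_alloc_func(VALUE klass)
{
    model_class *cls = (model_class *)klass;
    if (cls->copy_alloc) return model_copy_allocator_adapter;
    return cls->legacy_alloc;
}

/* Demo copy-aware allocator: returns the token 1 for a fresh allocation
 * and the token 2 for a copy, so callers can see which path ran. */
static VALUE
model_demo_alloc(VALUE klass, VALUE other)
{
    (void)klass;
    return other == MODEL_QUNDEF ? (VALUE)1 : (VALUE)2;
}
```

Because C function pointers carry no closure state, the adapter in this model recovers the real allocator from the class it is handed. Third-party code that calls the returned pointer with only a class therefore still reaches the new-style allocator on its regular allocation path.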
Opportunity for more changes?
I was discussing this new interface with @ko1 (Koichi Sasada) and it appears that the current allocator interface may also be a limitation for Ractors and Ractor-local GC, i.e. it might be useful to let the allocator function know that we're copying from one Ractor to another.
But I know too little about Ractor-local GC to make a proposal here, so I will let @ko1 (Koichi Sasada) make suggestions.
Implementation
I implemented this idea in https://github.com/ruby/ruby/pull/15795, to solve [Bug #21818].
It could remain a purely private API, but I think it would make sense to expose it.