Bug #21267: respond_to check in Class#allocate is inconsistent - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #21267

closed

respond_to check in Class#allocate is inconsistent

Bug #21267: respond_to check in Class#allocate is inconsistent

Added by jhawthorn (John Hawthorn) 9 months ago. Updated 8 months ago.

Status:

Closed

Assignee:

Target version:

ruby -v:

Backport:

3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN

[ruby-core:121659]

Description

Class#allocate has an additional rb_obj_respond_to(klass, rb_intern("allocate")) check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. bind_call.

>> Rational.allocate
(irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError)
>> Class.instance_method(:allocate).bind_call(Rational)
(irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)

However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:

>> Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end})
=> (0/1)

Or even override respond_to_missing?

>> Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end})
=> (0/1)

So I think we should remove this check. For classes that we need to forbid allocation we should use rb_undef_alloc_func.

The classes I see this used for are:

MatchData
Refinement
Module
Complex
Rational

My main motivation is that this check makes Class#allocate slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.

This makes allocate ~75% faster.

|allocate_no_params            |     19.009M|   20.087M|
|                              |           -|     1.06x|
|allocate_allocate             |     20.587M|   35.882M|
|                              |           -|     1.74x|

https://github.com/ruby/ruby/pull/13116

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#1 [ruby-core:121820]

At RubyKaigi @byroot (Jean Boussier) and I looked at this and came up with a more efficient and resilient way to do the check (https://github.com/ruby/ruby/commit/a6365cef1084191f992ee55fa661cb9aa2196c39). But coming back to this I again don't think it's effective enough to prevent uninitialized values and we should just remove the check.

Again, this check is trying to avoid users being able to make an uninitialized value of these classes, even if they are trying very hard

>> Class.instance_method(:allocate).bind_call(MatchData)
(irb):6:in 'Class#allocate': calling MatchData.allocate is prohibited (TypeError)

However we aren't protecting new with this check, and can get uninitialized values that way (even though it has been undefined in the same way as allocate):

>> Class.instance_method(:new).bind_call(MatchData)
An error occurred when inspecting the object: #<TypeError: uninitialized MatchData>
Result of Kernel#inspect: #<MatchData:0x00007387517150b0>

We thought the reason for this special check rather than using rb_undef_alloc_func is so that dup/clone can still work, but that itself is problematic and a way users can get an uninitialized value.

>> match = "a".match(/./)
=> #<MatchData "a">
>> match.clone # wanting this to work is why we don't rb_undef_alloc_func (I think)
=> #<MatchData "a">
>> def match.initialize_copy(x); end
=> :initialize_copy
>> match.clone # now this makes an uninitialized object
An error occurred when inspecting the object: #<TypeError: uninitialized MatchData>
Result of Kernel#inspect: #<MatchData:0x0000738752cfb370>

So I am back to believing we should remove the rb_obj_respond_to check, because it doesn't provide any safety. It seems like our only options are to tolerate uninitialized objects or use rb_undef_alloc_func.

Updated by Eregon (Benoit Daloze) 8 months ago Actions
Copy link
#2 [ruby-core:121825]

I think it makes sense to replace or remove that "incorrect" check given it doesn't apply to new.
The proper fix seems to also do the correct check in new.

I think it's best to keep the possibility for 3rd party C extensions to undefine allocate (to prevent creating uninitialized objects) and still allow dup/clone (which should work fine if people don't monkey-patch initialize_{copy,clone,dup}, if they do the segfault/error is their responsibility then).
With this change there is no protection in place left, so Class.instance_method(:allocate).bind_call(MyType) could segfault, even if MyType undefines allocate.

Looking at the PR, how about setting alloc_prohibited = true when undefining allocate? (and set it back to false if it's defined again later)

Another way would be to have separate allocation paths for "new object" and "copy" but this requires much more thought.
Or prevent this case of "no new/allocate but dup/clone", but then for core types too for consistency (maybe too incompatible? maybe not?).

Or core types could actually define dup & clone as overridden methods, and not use the alloc function at all and so use rb_undef_alloc_func().
That seems the cleanest actually, WDYT?

it doesn't provide any safety

Well, using bind_call is not rare and might even do it accidentally, OTOH defining a singleton initialize_copy/allocate method is pretty hardcore, and that should be no surprise it blows things up.

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#3 [ruby-core:121899]

Or core types could actually define dup & clone as overridden methods, and not use the alloc function at all and so use rb_undef_alloc_func().
That seems the cleanest actually, WDYT?

Yes. The only options that work are to rb_undef_alloc_func or returning something safe from the alloc func. The alloc_prohibited approach we tried is a dead end.

As far as I can tell in current Ruby uninitialized MatchData, Refinement, Module, Complex, Rational do not crash the VM (if they do we need to fix that, because again, users can access that today). Which is why we should remove it. If we want to adjust those classes to rb_undef_alloc_func that seems like a good idea and we can do that later.

Updated by Eregon (Benoit Daloze) 8 months ago Actions
Copy link
#4 [ruby-core:121913]

Right.
I think it's quite valuable to make those rb_undef_alloc_func to avoid having to handle uninitialized objects because that means extra checks (performance overhead) and less optimizations.
For example if the Class#superclass internal field is not guaranteed to be constant, a JIT/VM might be more limited in what it can cache (or has to introduce an extra check to handle the uninitialized Class case).

TruffleRuby already prevents Class.allocate for this reason. It'd be nice to do it for the other types you mentioned.
Compatibility issues should be rare because it doesn't make much sense to allocate and initialize those types separately, nor to subclass them.

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#5 [ruby-core:122293]

Status changed from Open to Closed

I missed tagging this issue but the check was removed by https://github.com/ruby/ruby/commit/b0502e8f904f155ac077014077431ed5e54b9996

Actions

Copy link

Also available in: PDF Atom

Project

General

Profile

Ruby

Custom queries

Bug #21267

respond_to check in Class#allocate is inconsistent

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#1 [ruby-core:121820]

Updated by Eregon (Benoit Daloze) 8 months ago Actions
Copy link
#2 [ruby-core:121825]

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#3 [ruby-core:121899]

Updated by Eregon (Benoit Daloze) 8 months ago Actions
Copy link
#4 [ruby-core:121913]

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#5 [ruby-core:122293]

Project

General

Profile

Ruby

Custom queries

Bug #21267

respond_to check in Class#allocate is inconsistent

Updated by jhawthorn (John Hawthorn) 8 months ago ActionsCopy link #1 [ruby-core:121820]

Updated by Eregon (Benoit Daloze) 8 months ago ActionsCopy link #2 [ruby-core:121825]

Updated by jhawthorn (John Hawthorn) 8 months ago ActionsCopy link #3 [ruby-core:121899]

Updated by Eregon (Benoit Daloze) 8 months ago ActionsCopy link #4 [ruby-core:121913]

Updated by jhawthorn (John Hawthorn) 8 months ago ActionsCopy link #5 [ruby-core:122293]

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#1 [ruby-core:121820]

Updated by Eregon (Benoit Daloze) 8 months ago Actions
Copy link
#2 [ruby-core:121825]

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#3 [ruby-core:121899]

Updated by Eregon (Benoit Daloze) 8 months ago Actions
Copy link
#4 [ruby-core:121913]

Updated by jhawthorn (John Hawthorn) 8 months ago Actions
Copy link
#5 [ruby-core:122293]