Bug #15027
closedWhen Struct#each method is overriden Struct#select and Struct#to_a use wrong collections
Description
Bug¶
Here's the code snippet that should reproduce the problem:
class Foo < Struct.new(:bar)
def each(&block)
[:baz, :qux].each(&block)
end
end
foo = Foo.new(:foo)
foo.map(&:itself) # => [:baz, :qux] # OK
foo.to_a # => [:foo] # NOT OK, expected [:baz, :qux]
foo.select(&:itself) # => [:foo] # NOT OK, expected [:baz, :qux]
As you can see, even tho we defined another collection for use by overriding #each
, the to_a
and select
still use Struct
's original collection.
The problem seem to be with Struct#to_a
and Struct#select
methods from struct.c
file that are defined unnecessarily.
Proposed solution¶
The attached solution simply deletes Struct#select
and Struct#to_a
. A couple tests are added to show everything still works as before.
Please let me know if I can provide any more info and I'll be ready to do so.
Files
Updated by matz (Yukihiro Matsumoto) about 6 years ago
- Status changed from Open to Feedback
The proposed behavior is more consistent but slower. I am not sure it's a good idea to hinder performance by supporting consistency in the rare case. Any opinion?
Matz.
Updated by jeremyevans0 (Jeremy Evans) about 6 years ago
I'm against changing the current behavior. If you can override each
, you can override other methods. If we wanted to be consistent about making this change, we would probably need to remove most the following additional core methods so that the Enumerable implementation would be used:
Array: any?, collect, count, cycle, drop, drop_while, find_index, first, include?, map, max, min, reject, reverse_each, select, sort, sum, take, take_while, to_a, to_h, uniq, zip
Hash: any?, include?, member?, to_a, to_h
Range: first, include?, max, member?, min
Struct: to_h
Above results generated with the following code (which includes some additional classes and methods):
classes = []
meths = Enumerable.instance_methods(false) - [:each]
ObjectSpace.each_object(Class){|c| classes << c if c < Enumerable && c.name}
classes.sort_by(&:name).each do |c|
same = c.instance_methods(false) & meths
next if same.empty?
puts "#{c}: #{same.sort.join(', ')}"
end
I'm guessing the standard library would need additional changes (e.g. set).
If anyone really wants the default Enumerable behavior for Struct:
Struct.remove_method(:to_a, :to_h, :select)
Updated by shevegen (Robert A. Heiler) about 6 years ago
Since matz asked for feedback, just a comment - I have no particular pro or con opinion
per se, mostly because I very rarely use the Struct/OpenStruct family; I usually just end
up writing a "real" class instead and adapt it to what is necessary. The only suggestion
I would like to make in regards to Struct is to mention this in the documentation of Struct (
https://ruby-doc.org/core-2.5.1/Struct.html ) so that it may not surprise ruby users who
use Struct; could also mention Jeremy's example of using the default Enumerable behaviour
for Struct.
Updated by Hanmac (Hans Mackowiak) about 6 years ago
as a middle way, can't we just do the "is overwritten by user" check?
i think i have seen it on other classes like Array or Hash
so it checks if the each
method is overwritten, in that case, call the Enumerable
method,
if not, call the Struct
own one.
Updated by nobu (Nobuyoshi Nakada) about 6 years ago
diff --git a/struct.c b/struct.c
index 7de46980aa..e4c875b5be 100644
--- a/struct.c
+++ b/struct.c
@@ -860,6 +860,9 @@ rb_struct_inspect(VALUE s)
static VALUE
rb_struct_to_a(VALUE s)
{
+ if (!rb_method_basic_definition_p(CLASS_OF(s), idEach)) {
+ return rb_call_super(0, 0);
+ }
return rb_ary_new4(RSTRUCT_LEN(s), RSTRUCT_CONST_PTR(s));
}
@@ -1077,6 +1080,9 @@ rb_struct_select(int argc, VALUE *argv, VALUE s)
rb_check_arity(argc, 0, 0);
RETURN_SIZED_ENUMERATOR(s, 0, 0, struct_enum_size);
+ if (!rb_method_basic_definition_p(CLASS_OF(s), idEach)) {
+ return rb_call_super(argc, argv);
+ }
result = rb_ary_new();
for (i = 0; i < RSTRUCT_LEN(s); i++) {
if (RTEST(rb_yield(RSTRUCT_GET(s, i)))) {
Updated by bruno (Bruno Sutic) about 6 years ago
Thank you for the code @nobu (Nobuyoshi Nakada). I have applied Nobu's suggestion, please find updated patch in the attach.
If you can override each, you can override other methods.
This is true, but doing this is very un-ruby-ish. Also, I don't think an average ruby developer will be patient enough to investigate why is this happening. A more realistic scenario is that a developer will just implement a custom #select
or #to_a
method.
Updated by Hanmac (Hans Mackowiak) about 6 years ago
@matz (Yukihiro Matsumoto): is the patch from Nobu good enough?
Updated by bruno (Bruno Sutic) about 6 years ago
Hi,
let me know if you have any further feedback on this one?
Updated by mame (Yusuke Endoh) almost 5 years ago
- Status changed from Feedback to Assigned
- Assignee set to matz (Yukihiro Matsumoto)
Matz, do you accept the suggested patch?
To be honest, I'm negative against this patch. It brings false consistency. The current behavior is very easy to explain; Struct#select
is just overridden. A redefinition check is a magic.
Updated by Dan0042 (Daniel DeLorme) almost 5 years ago
In the case that a method is overridden only for optimization's sake, the override check is simply the equivalent of deoptimizing when a required constraint doesn't hold. This is pretty common in a VM right?
Updated by mame (Yusuke Endoh) almost 5 years ago
In general, optimization should be invisible from users (except performance). This kind of redefinition checks in VM are carefully designed to be invisible.
Apparently, Struct#select
is defined for the sake of performance improvement. But IMO, it is not an optimization because it is still visible from users, even after this hack is introduced.
class Foo < Struct.new(:bar)
end
class Bar
include Enumerable
end
Bar.instance_method(:select).bind_call([:ok]) {|x| p x } #=> :ok
Foo.instance_method(:select).bind_call([:ok]) {|x| p x }
#=> bind argument must be an instance of Struct (TypeError)
If there is no redefinition check, we can easily explain the current situation; it is a normal override.
That being said, I agree that the current behavior looks inconsistent. I don't like the patch, but I'm not against it.
Updated by matz (Yukihiro Matsumoto) about 3 years ago
- Status changed from Assigned to Rejected
The methods may be overridden in the subclass. The assumption in this report seems to be too much.
Rejected.
Matz.