Project

General

Profile

Actions

Bug #21375

open

Set[] does not call #initialize

Added by Ethan (Ethan -) 2 days ago. Updated 1 day ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.5.0dev (2025-05-26T17:42:35Z master 909a0daab6) +PRISM [x86_64-darwin22]
[ruby-core:122303]

Description

I have a subclass of Set that overrides #initialize. Following #21216, .new does call #initialize but .[] does not.

class MySet < Set
  def initialize(enum = nil)
    compare_by_identity
    super
  end
end

MySet.new.compare_by_identity?
# => true
MySet[].compare_by_identity?
# => false

Updated by nobu (Nobuyoshi Nakada) 2 days ago

I'm not sure if it is intentional or not.
If unintentional, this patch may fix it.

diff --git a/set.c b/set.c
index 8676c62cd35..c41781c446f 100644
--- a/set.c
+++ b/set.c
@@ -409,13 +409,7 @@ static VALUE
 set_s_create(int argc, VALUE *argv, VALUE klass)
 {
     VALUE set = set_alloc_with_size(klass, argc);
-    set_table *table = RSET_TABLE(set);
-    int i;
-
-    for (i=0; i < argc; i++) {
-        set_table_insert_wb(table, set, argv[i], NULL);
-    }
-
+    rb_obj_call_init(set, argc, argv);
     return set;
 }
 

Updated by jeremyevans0 (Jeremy Evans) 2 days ago

nobu (Nobuyoshi Nakada) wrote in #note-1:

I'm not sure if it is intentional or not.

It was intentional when I developed core Set. Set.[] takes different arguments than Set#initialize.

Array.[] does not call Array#initialize, and Hash.[] does not call Hash#initialize, so there is precedent for the core collection classes to operate this way.

Set.[] was not documented as calling Set#initialize previously, and there were no tests or specs for it. Relying on Set.[] calling Set#initialize was relying on undefined behavior.

If unintentional, this patch may fix it.

diff --git a/set.c b/set.c
index 8676c62cd35..c41781c446f 100644
--- a/set.c
+++ b/set.c
@@ -409,13 +409,7 @@ static VALUE
 set_s_create(int argc, VALUE *argv, VALUE klass)
 {
     VALUE set = set_alloc_with_size(klass, argc);
-    set_table *table = RSET_TABLE(set);
-    int i;
-
-    for (i=0; i < argc; i++) {
-        set_table_insert_wb(table, set, argv[i], NULL);
-    }
-
+    rb_obj_call_init(set, argc, argv);
     return set;
 }
 

I'm not sure whether this is correct. #initialize arguments are enumerables of elements, .[] arguments are elements. I think if we wanted to call #initialize, we would have to allocate an array for the elements, and pass that, which would reduce performance.

Updated by Ethan (Ethan -) 1 day ago

It seems like a regression to me. I mean, it broke my code - maybe I'm alone, I can't say whether other people override #initialize and expect Set[] to call it, but it seems like a reasonable assumption to make, particularly since it always has done that. I wasn't aware that Array[]/Hash[] don't call initialize and I find that counterintuitive, and would advocate against changing other classes/constructors to do that.

Set.[] was not documented as calling Set#initialize previously

My feeling is that the base expectation is that #initialize is called on new objects. I know constructors can allocate and do things without #initialize, but I think that is quite rare. Maybe I'm wrong, I just learned Array[] and Hash[] do that, but I've almost never seen it.

I'll also note that the effects of not calling #initialize are easy to miss, since the set will still be a perfectly functional set, but will have lost whatever checks or changes are intended on initialization. At least in my case this was not straightforward to trace back to this change, it took me a while in the debugger to figure out that some of my sets were unexpectedly no longer compare_by_identity as my #initialize sets.

Having figured that out it is of course trivial for me to override MySet.[], but I'd rather Set behave like it used to, especially if that would save other people having to debug subtle bugs and fix.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0