Bug #17048
closedCalling initialize_copy on live modules leads to crashes
Added by alanwu (Alan Wu) over 4 years ago. Updated about 3 years ago.
Description
Here's a repro script
loop do
m = Module.new do
prepend Module.new
def hello
end
end
klass = Class.new { include m }
m.send(:initialize_copy, Module.new)
GC.start
klass.new.hello rescue nil
end
Here's a script that shows that it has broken semantics even
when it happens to not crash.
module A
end
class B
include A
end
module C
Const = :C
end
module D
Const = :D
end
A.send(:initialize_copy, C)
p B::Const # :C, makes sense
A.send(:initialize_copy, D)
p B::Const # :D, makes sense
A.send(:initialize_copy, Module.new)
p (begin B::Const rescue NameError; 'NameError' end) # NameError, makes sense
A.send(:initialize_copy, C)
p B::Const # still NameErorr. Weird
This example shows that the problem exists as far back as 2.0.0.
I think the easiest way to fix this is to forbid calling :initialize_copy
on modules that have children. Another option is to try to decide on
the semantics of this. Though I don't think it's worth the effort as this
has been broken for a long time and people don't seem to to be using it.
Thoughts?
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
Removing Module#initialize_copy
would probably require changes to Module#{dup,clone}
. Maybe we can set a flag in Module#initialize_copy
such that calling the method raises if called again on the same module (not sure if that is what you meant)?
Updated by alanwu (Alan Wu) over 4 years ago
I'm not proposing that we remove initialize_copy
, but to make it raise when the receiver has children. So this is what it would look like:
module A
end
A.send(:initialize_copy, Module.new) # fine, no one inherits from A
class B
include A
A.send(:initialize_copy, Module.new) # raise, since A now has one child
end
Admittedly this proposed behavior is designed to sidestep the problem that the current implementation has.
Though maybe it's fine since this is a seldom used private method and other Ruby implementations don't have to follow this behavior?
Now that you've mentioned it, removing the method and moving the logic into Module#{dup,clone}
feels like the cleanest solution. It's more likely to be a breaking change though.
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
I would recommend to have initialize_copy
always raise instead of only raising if the module has children. It's more consistent, and there is no reason anyone should be calling initialize_copy
more than once. This shouldn't be considered a breaking change, as initialize_copy
is a private method. I think it is better than moving the logic to dup
/clone
, just in case a Module subclass is overridding initialize_copy
and calling super.
Updated by alanwu (Alan Wu) over 4 years ago
In principle I agree that allowing initialization only once is a good way to go, but the way Module.allocate
is currently setup makes implementing this a bit complicated. At the moment there is not really a pre-init state for modules and the result from Module.allocate
is the same as Module.new
. If we want to do this we would have to implement a pre-init state for modules, and make sure that all the operations on modules (adding methods, constants, etc) are aware of this state so they can do the initialization in case they receive a pre-init module.
It's doable, but I don't know if the extra consistency is worth the added complexity. It would slow things down a small bit for operations like method addition and has the potential to introduce crashes if we miss places where we need the initialization check.
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
I agree with @alanwu (Alan Wu), that it won't be worth.
diff --git c/class.c i/class.c
index 6835d2d7289..f7a56601634 100644
--- c/class.c
+++ i/class.c
@@ -354,6 +354,13 @@ static void ensure_origin(VALUE klass);
VALUE
rb_mod_init_copy(VALUE clone, VALUE orig)
{
+ if (RCLASS_EXT(clone)->subclasses ||
+ RCLASS_EXT(clone)->parent_subclasses ||
+ RCLASS_EXT(clone)->module_subclasses) {
+ rb_raise(rb_eTypeError, "cannot replace %s in use",
+ (RB_TYPE_P(clone, T_MODULE) ? "module" : "class"));
+ }
+
/* cloned flag is refer at constant inline cache
* see vm_get_const_key_cref() in vm_insnhelper.c
*/
diff --git c/test/ruby/test_module.rb i/test/ruby/test_module.rb
index d2da384cbd1..8d986f13413 100644
--- c/test/ruby/test_module.rb
+++ i/test/ruby/test_module.rb
@@ -435,6 +435,12 @@
assert_empty(m.constants, bug9813)
end
+ def test_initialize_copy_in_use
+ m = Module.new
+ Class.new {include m}
+ assert_raise(TypeError) {m.send(:initialize_copy, Module.new)}
+ end
+
def test_dup
OtherSetup.call
Updated by Eregon (Benoit Daloze) over 4 years ago
Should we rather design a good way to allow copying but yet not have to deal with uninitialized state?
Right now the only well-defined protocols for copying are
dup = allocate, copy @ivars, initialize_dup (which calls initialize_copy)
clone = allocate, copy @ivars, initialize_clone (which calls initialize_copy), clone also copies extra state like frozen and the singleton class
This means some classes have to support an "unintialized state" when otherwise they would not need to.
And in some cases it even means instances have to be mutable when they would otherwise not need to (e.g., MatchData, #16294).
So maybe we should make Module.allocate and #initialize_copy always raise, and override dup
and clone
for Module?
It's still unfortunate that this would mean duplicating the logic for dup/clone there.
So I think a better copying protocol is warranted here as it's not just an issue for Module.
Re @nobu's patch I don't like this ad-hoc condition which leaks implementation details into semantics.
How about having an initialized
flag that's set by #initialize
and #initialize_copy
and checked in both of these methods if we want a quick fix?
Updated by alanwu (Alan Wu) over 4 years ago
How about having an initialized flag that's set by #initialize and #initialize_copy and checked in both of these methods if we want a quick fix?
That doesn't work because you can trigger the bug without ever calling initialize
on the module:
m = Module.allocate
m.prepend(Module.allocate)
m.define_method(:hello) {}
klass = Class.new { include m }
m.send(:initialize_copy, Module.allocate)
GC.start
klass.new.hello rescue nil
# you may need to run this multiple times to get it to crash
If we want something like that we would have to implement an uninitialized state.
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
Updated by Eregon (Benoit Daloze) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-10:
Tried it.
https://github.com/nobu/ruby/tree/uninitialized-module
Right, removing Module.allocate
is one way since dup
/clone
uses the alloc function directly (and does not call allocate
).
I think calling Module.allocate
in user code makes no sense, so that approach seems a good way to me.
Updated by alanwu (Alan Wu) over 4 years ago
Thank you for the code, @nobu (Nobuyoshi Nakada) . I think with your branch we could even keep .allocate
, though people wouldn't be able to do much with it.
As long as no one is able to call initialize_copy
after children (iclasses) exist, it's fine.
I think I was wrong about the number of places we would have to plug to implement an uninitialized state that resolves the issue.
Only the places that make new iclasses need to check for the uninitilaized state, so jsut prepend
, include
and maybe refinements.
Side note about the branch (57c7f9b), it's possible to get access to an uninitialized module in Ruby land by subclassing from Module
:
class Sub < Module
def initialize_copy(other)
p ancestors
end
end
Sub.new.dup # [#<Sub:0x00007fa4ec015b10>, BasicObject]
It doesn't cause anything bad to happen AFAICT. I just found it interesting that the branch adds a normally impossible-to-construct module.
Maybe it's a positive because it makes Ruby more weird :D
Updated by ko1 (Koichi Sasada) over 4 years ago
sorry I didn't check all threads, but nobu's patch can close it?
Updated by alanwu (Alan Wu) about 4 years ago
Yes, nobu's patch fixes the crash. It is technically a breaking change though, so maybe it needs approval from Matz?
Side note, the bug still exists on master as of today.
Updated by jeremyevans0 (Jeremy Evans) over 3 years ago
This example in the original post has been fixed recently in the master branch. I bisected the fixing commit to 3dd3ea092acead6179033f2c95525ffc5b8bb6ff. However, as that changes constant lookup order, I'm guessing we aren't going to want to backport it.
Ruby still allows calling Module.allocate
and calling Module#initialize_copy
on modules with subclasses. And the example in #note-9 still can cause an occasional hang in my environment. So I'm going to leave this open.
Updated by nobu (Nobuyoshi Nakada) about 3 years ago
As far as I tested it with master
slightly, it didn't reproduce.
Does it still happen, and how occational?
Updated by jeremyevans0 (Jeremy Evans) about 3 years ago
nobu (Nobuyoshi Nakada) wrote in #note-16:
As far as I tested it with
master
slightly, it didn't reproduce.
Does it still happen, and how occational?
It still happens in my environment. I tested with 1000 iterations, and it hung on the 14th iteration:
$ for x in `jot 1000`; do echo -n .; ./ruby 17048-note9.rb; done
..............
Updated by alanwu (Alan Wu) about 3 years ago
Here a repro that that always crashes for me on cbbda3e:
mod = Module.new { define_method(:foo) {:first} }
klass = Class.new { include mod }
instance = klass.new
p instance.foo
new_mod = Module.new { define_method(:foo) { :second } }
mod.send(:initialize_copy, new_mod)
4.times { GC.start }
p instance.foo # [BUG] unreachable
Updated by nobu (Nobuyoshi Nakada) about 3 years ago
- Status changed from Open to Closed
Applied in changeset git|178ee1e801acb33d13b3e8a630f6ca4926c68fbc.
Already initialized modules cannot be replaced [Bug #17048]