Bug #14742
closedDeadlock when autoloading different constants in the same file from multiple threads
Description
The following example deadlocks consistently:
# a.rb
autoload :Foo, __dir__ + "/b"
autoload :Bar, __dir__ + "/b"
t1 = Thread.new { Foo }
t2 = Thread.new { Bar }
t1.join
t2.join
# b.rb
module Foo; end
module Bar; end
$ ruby a.rb
Traceback (most recent call last):
1: from a.rb:7:in `<main>'
a.rb:7:in `join': No live threads left. Deadlock? (fatal)
3 threads, 3 sleeps current:0x00007ffa817ae680 main thread:0x00007ffa8140bf30
* #<Thread:0x00007ffa818797f8 sleep_forever>
rb_thread_t:0x00007ffa8140bf30 native:0x00007fff9b3fa380 int:1
a.rb:7:in `join'
a.rb:7:in `<main>'
* #<Thread:0x00007ffa830379a8@a.rb:4 sleep_forever>
rb_thread_t:0x00007ffa817ace60 native:0x0000700007e7b000 int:0 mutex:0x00007ffa817ae680 cond:1
depended by: tb_thread_id:0x00007ffa8140bf30
/Users/eugene/.rbenv/versions/2.6.0-dev/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:59:in `require'
/Users/eugene/.rbenv/versions/2.6.0-dev/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:59:in `require'
a.rb:4:in `block in <main>'
* #<Thread:0x00007ffa830372c8@a.rb:5 sleep_forever>
rb_thread_t:0x00007ffa817ae680 native:0x0000700007f7e000 int:0
/Users/eugene/src/autoload_thread_safety_test/b.rb:1:in `<top (required)>'
/Users/eugene/.rbenv/versions/2.6.0-dev/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:59:in `require'
/Users/eugene/.rbenv/versions/2.6.0-dev/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:59:in `require'
a.rb:5:in `block in <main>'
Updated by normalperson (Eric Wong) over 6 years ago
Investigating
Updated by normalperson (Eric Wong) over 6 years ago
Yes, this is an old bug and not caused by my changes, I guess.
It requires reworking some internal data structures in
variable.c to provide feature mapping to autoload_data_i
so different autoloads referring to the same file (aka "feature")
can safely wait on each other.
Will work on this tomorrow, I need sleep :<
Updated by normalperson (Eric Wong) over 6 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r63387.
variable.c: fix multiple autoload with identical file
We need to ensure autoload declarations pointing to the same
feature (aka "file") can wait on each other to avoid deadlock
situations.
So, reorganize autoload data structures to maintain a
feature => autoload_data_i mapping, and have module constant
tables point to the new autoload_const struct instead of
directly to autoload_data_i. This allows multiple
autoload_const structs to refer to the SAME autoload_data_i
struct, and with it, the on-stack autoload_state.waitq.
The end result is different constants can share the same waitq
(tied to the feature name), and not deadlock each other during
loading.
Thanks to Eugene Kenny for the bug report and reproducible test case.
Reported-by: Eugene Kenny elkenny@gmail.com
- variable.c (autoload_featuremap): new global
(struct autoload_const): new per-const struct
(struct autoload_state): reference autoload_const instead of autoload_data_i
(struct autoload_data_i): remove per-const
(autoload_i_mark): delete from autoload_featuremap if unreferenced
(autoload_c_mark): new dmark callback
(autoload_c_free): new dfree callback
(autoload_c_memsize): new memsize callback
(autoload_const_type): new data type
(get_autoload_data): set autoload_const as well
(rb_autoload_str): use new data structures
(autoload_delete): cleanup from autoload_featuremap
(check_autoload_required): adjust for new internals
(rb_autoloading_value): ditto
(struct autoload_const_set_args): remove, redundant with autoload_const
(const_tbl_update): adjust for new internals
(autoload_const_set): ditto
(autoload_require): ditto
(autoload_reset): ditto
(rb_autoload_load): ditto
(rb_const_set): ditto
(current_autoload_data): ditto
(set_const_visibility): ditto - test/ruby/test_autoload.rb (test_autoload_same_file): new test
[ruby-core:86935] [Bug #14742]
Updated by normalperson (Eric Wong) over 6 years ago
Sorry for the delay, r63387 should fix it.
Updated by normalperson (Eric Wong) over 6 years ago
Eric Wong normalperson@yhbt.net wrote:
Sorry for the delay, r63387 should fix it.
Reverted for now. Some problems need fixing but I can't reproduce
on my 32-bit system (and my 64-bit machines have problems).
Updated by eugeneius (Eugene Kenny) over 6 years ago
It looks like the fix was un-reverted in r63392. Thank you for working on this, Eric!