Bug #920 [ruby-core:20794]
require is not thread-safe
| Status : | Open | Start : | 12/23/2008 | |
| Priority : | Normal | Due date : | ||
| Assigned to : | Nobuyoshi Nakada | % Done : | 0% |
|
| Category : | - | |||
| Target version : | - | |||
| ruby -v : | - |
Description
Currently, two threads requiring the same file may see many different results. In some versions, the second thread will immediately return false, whether the require has completed or not. In other versions, the second thread will start its own load of the file in question, possibly redefining classes, modules, constants, and reinstantiating singletons. In some recent versions of 1.9, the second thread will block. We need a consistent fix for this across all production-ready Ruby versions. If we agree that no require call should cause a given file to load twice, we have a few options: * Modify all require implementations to immediately return false if the requested resource has already been loaded or is in the process of being loaded. This does not guarantee the file has *completed* loading, but would ensure we do not double-initialize anything in a required file. Along with this, we would simply advise all developers to avoid concurrent requires * Modify all require implementations to use a global lock, disallowing concurrent requires entirely. This means required files must be well-behaved; they can't start up new threads that do requires or perform long-running tasks. It would fix the concurrency problem entirely and requires from multiple threads would simply happen in serial. Both of these require behavioral changes in at least some Ruby versions, with the latter requiring the most drastic changes (since concurrent requires will no longer run in parallel, as they do now, even if no common resources are in contention). A third option, having a per-resource lock, is infeasible due to the strong possibility of deadlocks. If two threads require separate files with circular dependencies, deadlock is almost assured. If this mechanism has been implemented in any versions of Ruby, it should be replaced with either of the two solutions above.
History
12/23/2008 03:03 AM - Dave Thomas
On Dec 22, 2008, at 10:50 AM, Charles Nutter wrote: > * Modify all require implementations to immediately return false if > the requested resource has already been loaded or is in the process > of being loaded. This does not guarantee the file has *completed* > loading, but would ensure we do not double-initialize anything in a > required file. Along with this, we would simply advise all > developers to avoid concurrent requires > * Modify all require implementations to use a global lock, > disallowing concurrent requires entirely. This means required files > must be well-behaved; they can't start up new threads that do > requires or perform long-running tasks. It would fix the concurrency > problem entirely and requires from multiple threads would simply > happen in serial. > > Both of these require behavioral changes in at least some Ruby > versions, with the latter requiring the most drastic changes (since > concurrent requires will no longer run in parallel, as they do now, > even if no common resources are in contention). > > A third option, having a per-resource lock, is infeasible due to the > strong possibility of deadlocks. If two threads require separate > files with circular dependencies, deadlock is almost assured. If > this mechanism has been implemented in any versions of Ruby, it > should be replaced with either of the two solutions above. The first solution will not be thread safe, for the reasons stated—the fact that require returns does not mean that the file is actually loaded. This would lead to hard-to-diagnose problems. I think the global lock is the way to go. And I find it hard to come up with any code this could currently break. Dave
12/23/2008 03:50 AM - Robert Klemme
2008/12/22 Dave Thomas <dave@pragprog.com>: > > On Dec 22, 2008, at 10:50 AM, Charles Nutter wrote: > >> * Modify all require implementations to immediately return false if the >> requested resource has already been loaded or is in the process of being >> loaded. This does not guarantee the file has *completed* loading, but would >> ensure we do not double-initialize anything in a required file. Along with >> this, we would simply advise all developers to avoid concurrent requires >> * Modify all require implementations to use a global lock, disallowing >> concurrent requires entirely. This means required files must be >> well-behaved; they can't start up new threads that do requires or perform >> long-running tasks. It would fix the concurrency problem entirely and >> requires from multiple threads would simply happen in serial. >> >> Both of these require behavioral changes in at least some Ruby versions, >> with the latter requiring the most drastic changes (since concurrent >> requires will no longer run in parallel, as they do now, even if no common >> resources are in contention). >> >> A third option, having a per-resource lock, is infeasible due to the >> strong possibility of deadlocks. If two threads require separate files with >> circular dependencies, deadlock is almost assured. If this mechanism has >> been implemented in any versions of Ruby, it should be replaced with either >> of the two solutions above. > > The first solution will not be thread safe, for the reasons stated—the fact > that require returns does not mean that the file is actually loaded. This > would lead to hard-to-diagnose problems. Absolutely, I think this is not really a solution. > I think the global lock is the way to go. And I find it hard to come up with > any code this could currently break. Please see my reply (link below) where I constructed a deadlock. Granted, you need to be able to create threads when holding the require lock - so if that is forbidden all is well. But the lock alone is not enough. http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/20802 Cheers robert -- remember.guy do |as, often| as.you_can - without end
12/23/2008 04:59 AM - Charles Nutter
Robert Klemme wrote: >> The first solution will not be thread safe, for the reasons stated—the fact >> that require returns does not mean that the file is actually loaded. This >> would lead to hard-to-diagnose problems. > > Absolutely, I think this is not really a solution. Well, by definition it's not a solution because it doesn't actually solve concurrent requires. It's just the minimum we change we need to agree upon, and it's what I've done in JRuby so far. But it's also the recommended fix in the absence of a solution. It hurts when I do this, doctor... >> I think the global lock is the way to go. And I find it hard to come up with >> any code this could currently break. > > Please see my reply (link below) where I constructed a deadlock. > Granted, you need to be able to create threads when holding the > require lock - so if that is forbidden all is well. But the lock > alone is not enough. > > http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/20802 I stand by my assertion that there's no way to fix this if we allow required files to themselves launch threads that do additional requires. That needs to be verboten or the whole thing comes crashing down. - Charlie
12/24/2008 06:16 AM - Roger Pack
I rather like the idea of only allowing "at most" one thread to be requiring at a time. It would break some existing code but bring sanity and safety, and...as much as ruby dislikes surprises, surprising thread issues should be extremely disliked, at least for me, which this would fix. Just my $0.02. -=r On Mon, Dec 22, 2008 at 9:50 AM, Charles Nutter <redmine@ruby-lang.org> wrote: > Bug #920: require is not thread-safe > http://redmine.ruby-lang.org/issues/show/920 > > Author: Charles Nutter > Status: Open, Priority: Normal > > Currently, two threads requiring the same file may see many different results. In some versions, the second thread will immediately return false, whether the require has completed or not. In other versions, the second thread will start its own load of the file in question, possibly redefining classes, modules, constants, and reinstantiating singletons. In some recent versions of 1.9, the second thread will block. We need a consistent fix for this across all production-ready Ruby versions. > > If we agree that no require call should cause a given file to load twice, we have a few options: > > * Modify all require implementations to immediately return false if the requested resource has already been loaded or is in the process of being loaded. This does not guarantee the file has *completed* loading, but would ensure we do not double-initialize anything in a required file. Along with this, we would simply advise all developers to avoid concurrent requires > * Modify all require implementations to use a global lock, disallowing concurrent requires entirely. This means required files must be well-behaved; they can't start up new threads that do requires or perform long-running tasks. It would fix the concurrency problem entirely and requires from multiple threads would simply happen in serial. > > Both of these require behavioral changes in at least some Ruby versions, with the latter requiring the most drastic changes (since concurrent requires will no longer run in parallel, as they do now, even if no common resources are in contention). > > A third option, having a per-resource lock, is infeasible due to the strong possibility of deadlocks. If two threads require separate files with circular dependencies, deadlock is almost assured. If this mechanism has been implemented in any versions of Ruby, it should be replaced with either of the two solutions above. > > > ---------------------------------------- > http://redmine.ruby-lang.org > > -- 2 Timothy 1:7
04/27/2009 06:16 AM - Tuomas Karkkainen
is circular require a feature of the ruby language? if it was disallowed the way rake handles invoke_with_call_chain and multitask could work on this issue as well: http://rake.rubyforge.org/classes/Rake/MultiTask.html http://www.fsdata.se/kund/gemdocs/rake-0.8.1/rdoc/classes/Rake/Task.html#M004984 that is: 1) each require is locked by it's own mutex, 2) each require passes along a list of all requires that is has traversed 3) a file may not require itself.