Bug #920 [ruby-core:20794]

require is not thread-safe

Added by Charles Nutter 193 days ago. Updated 151 days ago.

Status :Open Start :12/23/2008
Priority :Normal Due date :
Assigned to :Nobuyoshi Nakada % Done :

0%

Category :-
Target version :-
ruby -v :

-


Description

Currently, two threads requiring the same file may see many different results. In some versions, the second thread will immediately return false, whether the require has completed or not. In other versions, the second thread will start its own load of the file in question, possibly redefining classes, modules, constants, and reinstantiating singletons. In some recent versions of 1.9, the second thread will block. We need a consistent fix for this across all production-ready Ruby versions.

If we agree that no require call should cause a given file to load twice, we have a few options:

* Modify all require implementations to immediately return false if the requested resource has already been loaded or is in the process of being loaded. This does not guarantee the file has *completed* loading, but would ensure we do not double-initialize anything in a required file. Along with this, we would simply advise all developers to avoid concurrent requires
* Modify all require implementations to use a global lock, disallowing concurrent requires entirely. This means required files must be well-behaved; they can't start up new threads that do requires or perform long-running tasks. It would fix the concurrency problem entirely and requires from multiple threads would simply happen in serial.

Both of these require behavioral changes in at least some Ruby versions, with the latter requiring the most drastic changes (since concurrent requires will no longer run in parallel, as they do now, even if no common resources are in contention).

A third option, having a per-resource lock, is infeasible due to the strong possibility of deadlocks. If two threads require separate files with circular dependencies, deadlock is almost assured. If this mechanism has been implemented in any versions of Ruby, it should be replaced with either of the two solutions above.

History

12/23/2008 03:03 AM - Dave Thomas

On Dec 22, 2008, at 10:50 AM, Charles Nutter wrote:

> * Modify all require implementations to immediately return false if  
> the requested resource has already been loaded or is in the process  
> of being loaded. This does not guarantee the file has *completed*  
> loading, but would ensure we do not double-initialize anything in a  
> required file. Along with this, we would simply advise all  
> developers to avoid concurrent requires
> * Modify all require implementations to use a global lock,  
> disallowing concurrent requires entirely. This means required files  
> must be well-behaved; they can't start up new threads that do  
> requires or perform long-running tasks. It would fix the concurrency  
> problem entirely and requires from multiple threads would simply  
> happen in serial.
>
> Both of these require behavioral changes in at least some Ruby  
> versions, with the latter requiring the most drastic changes (since  
> concurrent requires will no longer run in parallel, as they do now,  
> even if no common resources are in contention).
>
> A third option, having a per-resource lock, is infeasible due to the  
> strong possibility of deadlocks. If two threads require separate  
> files with circular dependencies, deadlock is almost assured. If  
> this mechanism has been implemented in any versions of Ruby, it  
> should be replaced with either of the two solutions above.


The first solution will not be thread safe, for the reasons stated—the  
fact that require returns does not mean that the file is actually  
loaded. This would lead to hard-to-diagnose problems.

I think the global lock is the way to go. And I find it hard to come  
up with any code this could currently break.


Dave

12/23/2008 03:50 AM - Robert Klemme

2008/12/22 Dave Thomas <dave@pragprog.com>:
>
> On Dec 22, 2008, at 10:50 AM, Charles Nutter wrote:
>
>> * Modify all require implementations to immediately return false if the
>> requested resource has already been loaded or is in the process of being
>> loaded. This does not guarantee the file has *completed* loading, but would
>> ensure we do not double-initialize anything in a required file. Along with
>> this, we would simply advise all developers to avoid concurrent requires
>> * Modify all require implementations to use a global lock, disallowing
>> concurrent requires entirely. This means required files must be
>> well-behaved; they can't start up new threads that do requires or perform
>> long-running tasks. It would fix the concurrency problem entirely and
>> requires from multiple threads would simply happen in serial.
>>
>> Both of these require behavioral changes in at least some Ruby versions,
>> with the latter requiring the most drastic changes (since concurrent
>> requires will no longer run in parallel, as they do now, even if no common
>> resources are in contention).
>>
>> A third option, having a per-resource lock, is infeasible due to the
>> strong possibility of deadlocks. If two threads require separate files with
>> circular dependencies, deadlock is almost assured. If this mechanism has
>> been implemented in any versions of Ruby, it should be replaced with either
>> of the two solutions above.
>
> The first solution will not be thread safe, for the reasons stated—the fact
> that require returns does not mean that the file is actually loaded. This
> would lead to hard-to-diagnose problems.

Absolutely, I think this is not really a solution.

> I think the global lock is the way to go. And I find it hard to come up with
> any code this could currently break.

Please see my reply (link below) where I constructed a deadlock.
Granted, you need to be able to create threads when holding the
require lock - so if that is forbidden all is well.  But the lock
alone is not enough.

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/20802

Cheers

robert


-- 
remember.guy do |as, often| as.you_can - without end

12/23/2008 04:59 AM - Charles Nutter

Robert Klemme wrote:
>> The first solution will not be thread safe, for the reasons stated—the fact
>> that require returns does not mean that the file is actually loaded. This
>> would lead to hard-to-diagnose problems.
> 
> Absolutely, I think this is not really a solution.

Well, by definition it's not a solution because it doesn't actually 
solve concurrent requires. It's just the minimum we change we need to 
agree upon, and it's what I've done in JRuby so far. But it's also the 
recommended fix in the absence of a solution. It hurts when I do this, 
doctor...

>> I think the global lock is the way to go. And I find it hard to come up with
>> any code this could currently break.
> 
> Please see my reply (link below) where I constructed a deadlock.
> Granted, you need to be able to create threads when holding the
> require lock - so if that is forbidden all is well.  But the lock
> alone is not enough.
> 
> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/20802

I stand by my assertion that there's no way to fix this if we allow 
required files to themselves launch threads that do additional requires. 
That needs to be verboten or the whole thing comes crashing down.

- Charlie

12/24/2008 06:16 AM - Roger Pack

I rather like the idea of only allowing "at most" one thread to be
requiring at a time.  It would break some existing code but bring
sanity and safety, and...as much as ruby dislikes surprises,
surprising thread issues should be extremely disliked, at least for
me, which this would fix.
Just my $0.02.

-=r

On Mon, Dec 22, 2008 at 9:50 AM, Charles Nutter <redmine@ruby-lang.org> wrote:
> Bug #920: require is not thread-safe
> http://redmine.ruby-lang.org/issues/show/920
>
> Author: Charles Nutter
> Status: Open, Priority: Normal
>
> Currently, two threads requiring the same file may see many different results. In some versions, the second thread will immediately return false, whether the require has completed or not. In other versions, the second thread will start its own load of the file in question, possibly redefining classes, modules, constants, and reinstantiating singletons. In some recent versions of 1.9, the second thread will block. We need a consistent fix for this across all production-ready Ruby versions.
>
> If we agree that no require call should cause a given file to load twice, we have a few options:
>
> * Modify all require implementations to immediately return false if the requested resource has already been loaded or is in the process of being loaded. This does not guarantee the file has *completed* loading, but would ensure we do not double-initialize anything in a required file. Along with this, we would simply advise all developers to avoid concurrent requires
> * Modify all require implementations to use a global lock, disallowing concurrent requires entirely. This means required files must be well-behaved; they can't start up new threads that do requires or perform long-running tasks. It would fix the concurrency problem entirely and requires from multiple threads would simply happen in serial.
>
> Both of these require behavioral changes in at least some Ruby versions, with the latter requiring the most drastic changes (since concurrent requires will no longer run in parallel, as they do now, even if no common resources are in contention).
>
> A third option, having a per-resource lock, is infeasible due to the strong possibility of deadlocks. If two threads require separate files with circular dependencies, deadlock is almost assured. If this mechanism has been implemented in any versions of Ruby, it should be replaced with either of the two solutions above.
>
>
> ----------------------------------------
> http://redmine.ruby-lang.org
>
>



-- 
2 Timothy 1:7

02/03/2009 10:30 AM - Shyouhei Urabe

  • Assigned to set to Nobuyoshi Nakada
  • ruby -v set to -

04/27/2009 06:16 AM - Tuomas Karkkainen

is circular require a feature of the ruby language?

if it was disallowed the way rake handles invoke_with_call_chain and multitask could work on this issue as well:
http://rake.rubyforge.org/classes/Rake/MultiTask.html
http://www.fsdata.se/kund/gemdocs/rake-0.8.1/rdoc/classes/Rake/Task.html#M004984

that is:
1) each require is locked by it's own mutex, 
2) each require passes along a list of all requires that is has traversed
3) a file may not require itself.

Also available in: Atom PDF