Project

General

Profile

Bug #15661

Disallow concurrent Dir.chdir with block

Added by headius (Charles Nutter) over 1 year ago. Updated 4 days ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:91807]
Tags:

Description

Dir.chdir with a block should disallow concurrent use, since it will almost never produce the results a user expects.

In https://bugs.ruby-lang.org/issues/9785 calls for Dir.chdir to be made thread-safe were rejected because the underlying native call is process-global. This is reasonable because there's no way to easily make the native chdir be thread-local (at least not without larger changes to CRuby itself).

However the block form of chdir is explicitly bounded, and clearly indicates that the dir should be changed only for the given block. I believe Dir.chdir should prevent multiple threads from attempting to do this bounded chdir at the same time.

Currently, if two threads attempt to do a Dir.chdir with a block, one of them will see a warning: "conflicting chdir during another chdir block". This warning is presumably intended to inform the user that they may see unpredictable results, but I can think of no cases where you would ever see predictable results.

In other words, there's no reason to allow a user to do concurrent Dir.chdir with a block because they will always be at risk of unpredictable results, and I believe this only leads to hard-to-diagnose bugs.

The warning should be a hard error.

The warning should also be turned into an error if a non-block Dir.chdir call happens while a block Dir.chdir is in operation. The important thing is to protect the integrity of the current directory during the lifetime of that block invocation.

In CRuby terms, the patch would be to check for chdir_blocking > 0 and then simply error out, unless it's happening on the same thread.

Concurrent non-block Dir.chdir would be unaffected. This only protects the block form from having the current dir change while it is executing.

See https://github.com/jruby/jruby/issues/5649

Updated by shevegen (Robert A. Heiler) over 1 year ago

I sort of agree with headius. I also can not really think of a possible and
deliberate use case for concurrent Dir.chdir with a block. Perhaps something
is missing (someone has a use case?) but otherwise I sort of agree with him
here.

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

  • Description updated (diff)

Why not blocking until another block terminates?

Updated by headius (Charles Nutter) over 1 year ago

nobu (Nobuyoshi Nakada) I actually considered that my first preference, except for the introduction of a new global lock.

I worry about suddenly introducing deadlocks into someone's code, though that code would have to be doing some locking as well as a concurrent chdir somewhere (both locks need to be contended). Perhaps that code deserves what it got, but then again maybe it's better to just have a hard error anyway. I'm willing to discuss it more.

Updated by matz (Yukihiro Matsumoto) about 1 month ago

It's hard to estimate how big the incompatibility is. But I'd like to try it.
I hope we don't see any big problem.

Matz.

Updated by sam.saffron (Sam Saffron) about 1 month ago

I am not sure about this, we are already misleading people a lot with the block form. I wonder if the correct long term solution here is either to remove this misleading construct or do a breaking change and fix it properly.

Thread.new do
  sleep 0.5
  p `pwd` # I am in /a now
end

sleep 0.1

Dir.chdir('a') do 
  sleep 1 
  p `pwd`
end

Updated by jeremyevans0 (Jeremy Evans) 30 days ago

sam.saffron (Sam Saffron) wrote in #note-6:

I am not sure about this, we are already misleading people a lot with the block form. I wonder if the correct long term solution here is either to remove this misleading construct or do a breaking change and fix it properly.

Thread.new do
  sleep 0.5
  p `pwd` # I am in /a now
end

sleep 0.1

Dir.chdir('a') do 
  sleep 1 
  p `pwd`
end

Removing the block form breaks useful cases in single threaded code. Fixing it "properly" (by which I assume you mean separate pwd per thread) is impossible, as pwd is per-process, not per-thread.

If you want to remove the block form of Dir.chdir, please submit a separate feature request.

Updated by normalperson (Eric Wong) 30 days ago

merch-redmine@jeremyevans.net wrote:

Removing the block form breaks useful cases in single threaded
code.

Agreed.

Fixing it "properly" (by which I assume you mean
separate pwd per thread) is impossible, as pwd is per-process,
not per-thread.

openat(2) and the rest of the *at() functions are POSIX 200809,
so it could be possible to mimic per-thread working dirs if the
rest of File, IO, UNIXSocket, etc. were all made aware and used
*at() functions.

Updated by Eregon (Benoit Daloze) 29 days ago

Maybe concurrent Dir.chdir(&block) could take a global lock.
It might create unexpected contention though.
So I think the exception is better if it's not too incompatible.

Updated by sam.saffron (Sam Saffron) 29 days ago

I guess my bigger point here is that even with this fix the block form remains unsafe under concurrent use. At best this catches a few multithreading bugs. The construct is incompatible with multithreaded programming cause state leaks.

I do not object to making this "a little less terrible". But ... it remains terrible.

This fix also does nothing really for single threaded programs which are not in scope.

Updated by Eregon (Benoit Daloze) 28 days ago

It's always been like that, there is no portable and fully reliable way to make cwd thread-local.
(using *at() functions are not enough, C extensions don't use them).

I guess Dir.chdir is even more problematic with Ractor where one might expect some isolation between Ractors.

A workaround is of course to use absolute paths instead of relative paths.
I think in practice programs using relative paths tend to be single-threaded or do so early during initialization, so I think it's not a big issue, but something to be aware of.

#12

Updated by jeremyevans (Jeremy Evans) 28 days ago

  • Status changed from Open to Closed

Applied in changeset git|5d7953f86b7ae164017e2292dfb3c108e0e02527.


Switch conflicting chdir warning to RuntimeError

The documentation already stated this was an error in one case
(when it was previously a warning). Describe the other case,
where chdir without block is called inside block passed to chdir.

Fixes [Bug #15661]

Updated by byroot (Jean Boussier) 26 days ago

Quick heads up, I think this kinda broke bundler (I'll report it there as well).

Bundler (or rubygems not quite sure yet) uses Dir.chdir to build native extensions, and bundler has a --jobs options to install gems from multiple concurrent threads.

So if two native extensions are compiled concurrently, it will crash.

Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

    conflicting chdir during another chdir block

Updated by byroot (Jean Boussier) 26 days ago

Ok, it was fixed a few months ago in bundler: https://github.com/rubygems/rubygems/pull/3479, it just wasn't not released yet.

Updated by jonathanhefner (Jonathan Hefner) 9 days ago

Although raising an error can be helpful for discovering buggy code, it prevents calling arbitrary third-party code from within a Dir.chdir block. This has a cascading effect. For example, a user's Rake task might use a blockless Dir.chdir, so Rake cannot invoke tasks from within a Dir.chdir block. Consequently, because Rake must use the blockless form of Dir.chdir, other code cannot call Rake (Rake::Application#run) from within a Dir.chdir block. And so on.

I would like to propose a change in how the error is raised. Instead of raising when a blockless Dir.chdir is called, raise at the end of a Dir.chdir block if the block completed successfully and Dir.pwd is different than it was at the start of the block. This would allow an "escape hatch" when calling third-party code:

Dir.chdir("some/directory") do |dir|
  call_third_party_code
  Dir.chdir(dir)
end

Updated by Dan0042 (Daniel DeLorme) 4 days ago

From the bug report, I got the idea this was about preventing a chdir while a chdir with block was active in another thread. That sounds fine to me, but this code also triggers a warning:

Dir.chdir("/usr") do
  #do things in /usr
  Dir.chdir("local") #warning: conflicting chdir during another chdir block
  #do things in /usr/local until the end of the block
end

If we're talking about turning this as well into an error I'm not sure I agree.

Also available in: Atom PDF