Project

General

Profile

Actions

Bug #17885

closed

require_relative and require should be compatible with each other when symlinked files are used

Added by john_firebaugh (John Firebaugh) over 3 years ago. Updated about 3 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-darwin19]
[ruby-core:104010]

Description

This is similar to #10222 (which was fixed), but it's with symlinked files. #10222 was with symlinked directories.

If files are symlinked, then mixing require and require_relative can result in the same file being loaded twice.

mkdir a
echo "require_relative 'b'" > a/a.rb
echo "p 'b loaded'" > a/b.rb
mkdir b
ln -s ../a/a.rb b
ln -s ../a/b.rb b
echo "$: << File.expand_path('../b', __FILE__); require 'b'; require 'a'" > c.rb
ruby c.rb

Expected behavior is that this prints "b loaded" once. Actual behavior is that it prints "b loaded" twice.


Related issues 2 (0 open2 closed)

Related to Ruby master - Bug #18452: Dramatic performance regression in Zeitwerk with 3.1RejectedActions
Is duplicate of Ruby master - Bug #10222: require_relative and require should be compatible with each other when symlinks are usedClosedActions

Updated by john_firebaugh (John Firebaugh) over 3 years ago

This causes an issue with rules_ruby for Bazel. Bazel uses symlinks extensively to implement hermetic build sandboxing.

Updated by vo.x (Vit Ondruch) over 3 years ago

I totally support this. And there is more to this #16978

Actions #3

Updated by shyouhei (Shyouhei Urabe) over 3 years ago

  • Is duplicate of Bug #10222: require_relative and require should be compatible with each other when symlinks are used added

Updated by jeremyevans0 (Jeremy Evans) over 3 years ago

I've submitted a pull request to prevent loading the same realpath multiple times: https://github.com/ruby/ruby/pull/4615

Actions #5

Updated by jeremyevans (Jeremy Evans) about 3 years ago

  • Status changed from Open to Closed

Applied in changeset git|ddb85c5d2bdf75a83eb163856508691a7436b446.


Do not load file with same realpath twice when requiring

This fixes issues with paths being loaded twice in certain cases
when symlinks are used.

It took me multiple attempts to get this working. My original
attempt tried to convert paths to realpaths before adding them
to $LOADED_FEATURES. Unfortunately, this doesn't work well
with the loaded feature index, which is based off load paths
and not realpaths. While I was able to get require working, I'm
fairly sure the loaded feature index was not being used as
expected, which would have significant performance implications.
Additionally, I was never able to get that approach working with
autoload when autoloading a non-realpath file. It also broke
some specs.

This takes a more conservative approach. Directly before loading the
file, if the file with the same realpath has been required, the
loading of the file is skipped. The realpaths are stored as
fstrings in a hidden hash.

When rebuilding the loaded feature index, the hash of realpaths
is also rebuilt. I'm guessing this makes rebuilding process
slower, but I don think that is a hot path. In general, modifying
loaded features is only done when reloading, and that tends to be
in non-production environments.

Change test_require_with_loaded_features_pop test to use 30 threads
and 300 iterations, instead of 4 threads and 1000 iterations.
I saw only sporadic failures with 4/1000, but consistent failures
30/300 threads. These failures were due to the fact that the
concurrent deletions from $LOADED_FEATURES in other threads can
result in rb_ary_entry returning nil when rebuilding the loaded
features index.

To avoid concurrency issues when rebuilding the loaded features
index, the building of the index itself is left alone, and
afterwards, a separate loop is done on a copy of the loaded feature
snapshot in order to rebuild the realpaths hash.

Fixes [Bug #17885]

Actions #6

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

  • Status changed from Closed to Open

Reopening this since the patch was reverted.

After multiple previous attempts to fix this were shown to still have the same issue by the AWS Graviton2 gcc 11 CI (thank you @fkorotkov (Fedor Korotkov)), I was finally able to reproduce the problem locally by calling Thread.pass inside load_lock. The issue is a race-condition in ruby_verbose checking in load_lock. The simplest way to fix it is to allow silencing the warning when loading encoders and transcoders. The encoder loading code already implicitly set ruby_verbose to false before loading, and back to its original setting after loading. The transcoder was also being required, and could also trigger the warning. If the transcoder require happened after the encoder finished, the warning was generated. I think the code was always vulnerable to a race condition, it just took the use of realpath when loading to expose the race condition between the encoder loading and transcoder loading.

To fix this, I adding a warn argument to require_internal and load_lock, and I added rb_require_internal_silent, which never issues the warning. Then I had encoder and transcoder loading use rb_require_internal_silent to avoid the warning.

Revised pull request at https://github.com/ruby/ruby/pull/4887

Actions #7

Updated by jeremyevans (Jeremy Evans) about 3 years ago

  • Status changed from Open to Closed

Applied in changeset git|79a4484a072e9769b603e7b4fbdb15b1d7eccb15.


Do not load file with same realpath twice when requiring

This fixes issues with paths being loaded twice in certain cases
when symlinks are used.

It took me multiple attempts to get this working. My original
attempt tried to convert paths to realpaths before adding them
to $LOADED_FEATURES. Unfortunately, this doesn't work well
with the loaded feature index, which is based off load paths
and not realpaths. While I was able to get require working, I'm
fairly sure the loaded feature index was not being used as
expected, which would have significant performance implications.
Additionally, I was never able to get that approach working with
autoload when autoloading a non-realpath file. It also broke
some specs.

This takes a more conservative approach. Directly before loading the
file, if the file with the same realpath has been required, the
loading of the file is skipped. The realpaths are stored as
fstrings in a hidden hash.

When rebuilding the loaded feature index, the hash of realpaths
is also rebuilt. I'm guessing this makes rebuilding process
slower, but I don think that is a hot path. In general, modifying
loaded features is only done when reloading, and that tends to be
in non-production environments.

Change test_require_with_loaded_features_pop test to use 30 threads
and 300 iterations, instead of 4 threads and 1000 iterations.
I saw only sporadic failures with 4/1000, but consistent failures
30/300 threads. These failures were due to the fact that the
concurrent deletions from $LOADED_FEATURES in other threads can
result in rb_ary_entry returning nil when rebuilding the loaded
features index.

To avoid concurrency issues when rebuilding the loaded features
index, the building of the index itself is left alone, and
afterwards, a separate loop is done on a copy of the loaded feature
snapshot in order to rebuild the realpaths hash.

Fixes [Bug #17885]

Actions #8

Updated by mame (Yusuke Endoh) almost 3 years ago

  • Related to Bug #18452: Dramatic performance regression in Zeitwerk with 3.1 added
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0