Bug #18452
closedDramatic performance regression in Zeitwerk with 3.1
Description
Hi!
In Ruby 3.1, Zeitwerk loads implicit namespaces way slower than in previous versions. In this benchmark:
require 'fileutils'
require 'tmpdir'
require 'zeitwerk'
Dir.mktmpdir do |dir|
Dir.chdir(dir)
for i in 'a'..'z'
for j in 'a'..'z'
FileUtils.mkdir_p("lib/#{i}/#{j}")
end
end
loader = Zeitwerk::Loader.new
loader.push_dir('lib')
loader.setup
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
loader.eager_load
puts Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end
you go from about 0.06s to +11s. That is like a 180x slow down .
I have seen that the single cause of the slow down is one line: Zeitwerk's Kernel#require
wrapper pushing to $LOADED_FEATURES
when it intercepts a directory source code. For context, Zeitwerk sets autoloads for directories that define namespaces, so that their corresponding modules get autovivified when first referenced by the user.
I have written a self-contained script that reproduces the root of the issue without Zeitwerk, in case that helps debugging:
module Kernel
alias original_require require
def require(path)
if path.start_with?('/foo')
$LOADED_FEATURES << path
true
else
original_require(path)
end
end
end
foos = []
('a'..'z').each do |i|
('a'..'z').each do |j|
foos << "/foo#{i}#{j}"
end
end
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
foos.each { |foo| require foo }
puts Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
I have seen that this commit is responsible for the majority of the regression, But it is not the only cause, before that commit, the script is faster, but not as fast by an important factor too.
Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago
The bug fixed in 79a4484a072e9769b603e7b4fbdb15b1d7eccb15 is always going to result in some slow down, since Ruby now has to determine the realpath for all entries in $LOADED_FEATURES
. The commit doesn't change when the loaded feature index is rebuilt, though it makes the rebuilding process slower as the realpath is recalculated for all entries in $LOADED_FEATURES
.
$LOADED_FEATURES
is supposed to have files that ruby has required, it's not supposed to contain directories. So any code that modifies $LOADED_FEATURES
to add a directory is doing something wrong, in my opinion. Is there a reason Zeitwerk is adding directories to $LOADED_FEATURES
, instead of using its own data structure? I assume if you are overriding require
, you can check your own data structure instead of abusing $LOADED_FEATURES
.
FWIW, I tried the reproducer without Zeitwerk, and it performed about the same in both 3.0 and 3.1:
$ ruby30 -v t/t33.rb
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-openbsd]
0.00035451585426926613
$ ruby31 -v t/t33.rb
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-openbsd]
0.00035237614065408707
I'm not sure why the code in your non-Zeitwerk example would perform differently, as the index rebuild only happens when get_loaded_features_index
is called (which is called by Kernel#require
in ruby code) is called, it's not called just by modifying $LOADED_FEATURES
, and your non-Zeitwerk example doesn't call Kernel#require
, it only modifies $LOADED_FEATURES
.
If I modify the script slightly, to add a require 'set'
to the code being timed, then I do see a difference, it is a little less than 3x slower:
$ ruby30 -v t/t33.rb
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-openbsd]
0.005338589195162058
$ ruby31 -v t/t33.rb
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-openbsd]
0.01448665326461196
Considering it has to determine the realpath on over 65536 entries in $LOADED_FEATURES
, that seems reasonable. My only guess as to what is happening with Zeitwerk is Zeitwerk is requiring an loaded feature index rebuild every time it is modifying $LOADED_FEATURES
, resulting in quadratic behavior.
Updated by fxn (Xavier Noria) almost 3 years ago
Is there a reason Zeitwerk is adding directories to $LOADED_FEATURES, instead of using its own data structure?
It didn't do that since it first shipped. The lib intercepted the require
and did its thing without invoking the original method, and without storing the argument in $LOADED_FEATURES
. It hijacked the call and nobody noticed. This has worked fine since forever.
But last week I got a report about some namespaces not loading. The reason for that was that the project loaded Zeitwerk first, and Bootsnap later. Both decorate require
. Bootsnap assumes a successful require
puts its argument in $LOADED_FEATURES
and caches the directory internally. And that prevented reload, because $LOADED_FEATURES
did not have the directory to clean it up, the cache of Bootsnap became out of sync, and Bootsnap returned false
for subsequent require
calls right away, Zeitwerk's was not even called.
I thought about different options, and believed that Bootsnap made a reasonable assumption, so just a few days ago I published a new version that pushed the directory to $LOADED_FEATURES
.
You are right about directories being unexpected in that collection, see my comment in the original issue.
You are also right about the self-contained script. I missed some scientific notation and misread some numbers. These are the figures:
BEFORE 79a4484a07
% ./ruby -v ~/tmp/foo.rb
ruby 3.1.0dev (2021-10-02T14:51:29Z v3_1_0_preview1~1463 3f7da458a7) [x86_64-darwin21]
/Users/fxn/prj/zeitwerk/lib/zeitwerk/kernel.rb:24: warning: method redefined; discarding old require
Zeitwerk: 0.28963999999905354
Self-contained: 0.000389000000723172
AFTER 79a4484a07
% ./ruby -v ~/tmp/foo.rb
ruby 3.1.0dev (2021-10-02T14:51:29Z v3_1_0_preview1~1462 79a4484a07) [x86_64-darwin21]
/Users/fxn/prj/zeitwerk/lib/zeitwerk/kernel.rb:24: warning: method redefined; discarding old require
Zeitwerk: 11.44435700000031
Self-contained: 9.200000022246968e-05
As you see, the reported impact happens in the script using Zeitwerk.
Every require
call pushes to $LOADED_FEATURES
, and if I comment out the line that pushes to $LOADED_FEATURES
in Zeitwerk performance is not affected. Why is the lib's push
more costly?
Updated by fxn (Xavier Noria) almost 3 years ago
BTW, Zeitwerk does not call the original require
either. But it is a fact that if you comment out the push
call, the regression disappears.
Updated by mame (Yusuke Endoh) almost 3 years ago
- Related to Bug #17885: require_relative and require should be compatible with each other when symlinked files are used added
Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago
fxn (Xavier Noria) wrote in #note-2:
I thought about different options, and believed that Bootsnap made a reasonable assumption, so just a few days ago I published a new version that pushed the directory to
$LOADED_FEATURES
.You are right about directories being unexpected in that collection, see my comment in the original issue.
I think your original assumption is more reasonable, and Bootsnap's assumption is not. I'm also not sure what code is expecting a Kernel#require
with a directory to work, but such code also seems suspect, IMO. If you've only made this change to Zeitwerk in the last week, I recommend reverting it.
Every
require
call pushes to$LOADED_FEATURES
, and if I comment out the line that pushes to$LOADED_FEATURES
in Zeitwerk performance is not affected. Why is the lib'spush
more costly?
When you modify $LOADED_FEATURES
directly, you trigger a rebuild of the loaded feature index the next time Kernel#require
(or related C-API method) is called. This has been the case for many years, it's just that index rebuilds are more expensive now due to realpath calculations. In general, it's best to never modify $LOADED_FEATURES
unless you must (such as to remove entries to allow for code reloading).
Updated by fxn (Xavier Noria) almost 3 years ago
I think your original assumption is more reasonable, and Bootsnap's assumption is not.
OK, thanks for the hint.
I'm also not sure what code is expecting a Kernel#require with a directory to work, but such code also seems suspect, IMO. If you've only made this change to Zeitwerk in the last week, I recommend reverting it.
In projects managed by Zeitwerk, subdirectories represent namespaces. If the namespace has a file defining it, all is normal, but if there is no explicit file, the library is expected to create a dummy module object and assign it to the corresponding constant.
Now, there's a general expectation in Zeitwerk that says that things are loaded on demand, when referenced. In order to do so and be symmetric with everything else, Zeitwerk defines an autoload on the directory. That is, if you have
app/controllers/admin
and no admin.rb
anywhere, it executes autoload :Admin, "full/path/to/app/controllers/admin"
. Then, the require
wrapper intercepts that call, autovivifies, does housekeeping, and does not call the original method. The autoload is hijacked.
And this has worked flawlessly since it first shipped about two years ago. Pushing to $LOADED_FEATURES
is something new that has been out only for a few days.
Thanks for the explanation on the index being rebuilt!
I think we can close Jeremy, I'll follow your advice and revert. Thanks for your help.
Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago
- Status changed from Open to Rejected