Bug #19246
closedRebuilding the loaded feature index much slower in Ruby 3.1
Description
Some background to this issue: (This is a case that is unconventional usage of Ruby, but I hope you bear with me.)
We ship the Ruby interpreter with our desktop applications for plugin support in our application (SketchUp).
One feature we have had since, at least 2006 (maybe earlier-hard to track history beyond that) is that we had a custom alternate require
method: Sketchup.require
. This allows the users of our API to load encrypted Ruby files.
This originally used rb_provide
to add the path to the encrypted file into the list of loaded feature. However, somewhere between Ruby 2.2 and 2.5 there was some string optimisations made and the function rb_provide
would not use a copy of the string passed to it. Instead it just held on to a pointer reference. In our case that string came from user-land, being passed in from Sketchup.require
and would eventually be garbage collected and cause access violation crashes.
To work around that we changed our custom Sketchup.require
to push to $LOADED_FEATURES
directly. There was a small penalty to the index being rebuilt after that, but it was negligible.
Recently we tried to upgrade the Ruby interpreter in our application from 2.7 to 3.1 and found a major performance reduction when using our `Sketchup.require. As in, a plugin that would load in half a second would now spend 30 seconds.
From https://bugs.ruby-lang.org/issues/18452 it sounds like there is some expected extra penalty due to changes in how the index is built. But should it really be this much?
Example minimal repro to simulate the issue:
# frozen_string_literal: true
require 'benchmark'
iterations = 200
foo_files = iterations.times.map { |i| "#{__dir__}/tmp/foo-#{i}.rb" }
foo_files.each { |f| File.write(f, "") }
bar_files = iterations.times.map { |i| "#{__dir__}/tmp/bar-#{i}.rb" }
bar_files.each { |f| File.write(f, "") }
biz_files = iterations.times.map { |i| "#{__dir__}/tmp/biz-#{i}.rb" }
biz_files.each { |f| File.write(f, "") }
Benchmark.bm do |x|
x.report('normal') {
foo_files.each { |file|
require file
}
}
x.report('loaded_features') {
foo_files.each { |file|
require file
$LOADED_FEATURES << "#{file}-fake.rb"
}
}
x.report('normal again') {
biz_files.each { |file|
require file
}
}
end
C:\Users\Thomas\SourceTree\ruby-perf>ruby27.bat
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x64-mingw32]
C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
user system total real
normal 0.000000 0.031000 0.031000 ( 0.078483)
loaded_features 0.015000 0.000000 0.015000 ( 0.038759)
normal again 0.016000 0.032000 0.048000 ( 0.076940)
C:\Users\Thomas\SourceTree\ruby-perf>ruby30.bat
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x64-mingw32]
C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
user system total real
normal 0.000000 0.031000 0.031000 ( 0.074733)
loaded_features 0.032000 0.000000 0.032000 ( 0.038898)
normal again 0.000000 0.047000 0.047000 ( 0.076343)
C:\Users\Thomas\SourceTree\ruby-perf>ruby31.bat
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x64-mingw-ucrt]
C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
user system total real
normal 0.016000 0.031000 0.047000 ( 0.132633)
loaded_features 1.969000 11.500000 13.469000 ( 18.395761)
normal again 0.031000 0.125000 0.156000 ( 0.249130)
Right now we're exploring options to deal with this. Because the performance degradation is a blocker for us upgrading. We also have 16 years of plugins created by third party developer that makes it impossible for us to drop this feature.
Some options as-is, none of which are ideal:
-
We revert to using
rb_provide
but ensure the string passed in is not owned by Ruby, instead building a list of strings that we keep around for the duration of the application process. The problem is that some of our plugin developers have on occasion released plugins that will touch$LOADED_FEATURES
, and if such a plugin is installed on a user machine it might cause the application to become unresponsive for minutes. The other non-ideal issue with usingrb_provide
is that we're also using that in ways it wasn't really intended (from that I understand). And it's not an official API? -
We create a separate way for our
Sketchup.require
to keep track of it's loaded features, but then that would diverge even more from the behaviour ofrequire
. Replicatingrequire
functionality is not trivial and would be prone to subtle errors and possible diverge. It also doesn't address our issue that there is code out there in existing plugins that touches$LOADED_FEATURES
. (And it's not something we can just ask people to clean up. From previous experience old versions stick around for a long time and is very hard to purge from circulation.)
I have two questions for the Ruby mantainers:
-
Would it be reasonable to see an API for adding/removing/checking
$LOADED_FEATURE
that would allow for a more ideal implementation of a customrequire
functionality? -
Is the performance difference in rebuilding the loaded feature index really expected to be as high as what we're seeing? An increase of nearly 100 times? Is there something there that might be addressed to make the rebuild to be less expensive against? (This would really help to address our challenges with third party plugins occasionally touching the global.)
Updated by thomthom (Thomas Thomassen) almost 2 years ago
Additional note: For development purposes there are also need to remove from $LOADED_FEATURES
in order to reload files without reloading the application. The application being a desktop application makes it not ideal to restart it every time you make a source code change while developing a plugin. In this case is performance impact is also a major issue as you'd constantly be hit with extremely slow reloads.
Updated by thomthom (Thomas Thomassen) almost 2 years ago
One more thing (Sorry for sounding like Colombo), we also remove items from $LOADED_FEATURES
when plugins are updated. Because the application is long running and we have a plugin manager that lets the user install new versions we remove all entries in $LOADED_FEATURES
that belong to the plugin being updated in order to allow the update to take effect immediately. At the moment there is no other means than manipulating the global directly. And in 3.1 the performance impact is too great to be put in front of users. (Something like rb_unprovide
would be nice.)
Updated by MSP-Greg (Greg L) over 1 year ago
This does seem like a considerable slowdown on Windows.
The above code uses absolute paths with require
, and also adds nonexistent files to $LOADED_FEATURES
. I changed the code so that all requires are simply a File.basename
,
and the items pushed to $LOADED_FEATURES
are real files. Below is data for several Ruby versions, run with both Windows and WSL2/Ubuntu 22.04.
The 'normal' & normal again' benches each require 200 files. The 'loaded features' bench requires 200 files, and additionally adds 200 different file names to $LOADED_FEATURES
.
Adding items to $LOADED_FEATURES
has worse performance with Ruby 3.1 and later on both Windows & Ubuntu, but the Windows degradation is orders of magnitude larger.
Issue 19378 'Windows: Use less syscalls for faster require of big gems' seems to involve the same problem.
Windows
user system total real
ruby 2.7.7p221 (2022-11-24 revision 168ec2b1e5) [x64-mingw32]
normal 0.016000 0.047000 0.063000 ( 0.103496)
loaded features 0.031000 0.031000 0.062000 ( 0.157406)
normal again 0.000000 0.094000 0.094000 ( 0.100160)
ruby 3.0.5p211 (2022-11-24 revision ba5cf0f7c5) [x64-mingw32]
normal 0.000000 0.063000 0.063000 ( 0.100240)
loaded features 0.032000 0.031000 0.063000 ( 0.156637)
normal again 0.000000 0.062000 0.062000 ( 0.098520)
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x64-mingw-ucrt]
normal 0.000000 0.062000 0.062000 ( 0.174335)
loaded features 3.141000 18.860000 22.001000 ( 33.106847)
normal again 0.015000 0.265000 0.280000 ( 0.400028)
ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x64-mingw-ucrt]
normal 0.000000 0.157000 0.157000 ( 0.168443)
loaded features 3.516000 18.547000 22.063000 ( 32.569175)
normal again 0.047000 0.218000 0.265000 ( 0.390879)
ruby 3.3.0dev (2023-02-27T15:23:31Z master ea830ab29d) [x64-mswin64_140]
normal 0.015000 0.094000 0.109000 ( 0.175751)
loaded features 3.235000 16.640000 19.875000 ( 31.294871)
normal again 0.031000 0.203000 0.234000 ( 0.386469)
user system total real
require 800 0.015000 0.235000 0.250000 ( 0.391236) ruby 2.7.7p221
require 800 0.078000 0.156000 0.234000 ( 0.391100) ruby 3.0.5p211
require 800 0.031000 0.484000 0.515000 ( 0.664927) ruby 3.1.0p0
require 800 0.109000 0.391000 0.500000 ( 0.651035) ruby 3.1.3p185
require 800 0.062000 0.406000 0.468000 ( 0.704620) ruby 3.3.0dev
Note: Ruby master timing is similar for mingw, ucrt, & mswin builds
WSL2/Ubuntu
user system total real
ruby 2.7.7p221 (2022-11-24 revision 168ec2b1e5) [x86_64-linux]
normal 0.001399 0.002799 0.004198 ( 0.004197)
loaded features 0.065028 0.000000 0.065028 ( 0.065073)
normal again 0.004481 0.000000 0.004481 ( 0.004482)
ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
normal 0.004395 0.000000 0.004395 ( 0.004394)
loaded features 0.065669 0.000000 0.065669 ( 0.065717)
normal again 0.004371 0.000000 0.004371 ( 0.004371)
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
normal 0.006511 0.000000 0.006511 ( 0.006505)
loaded features 0.189599 0.138964 0.328563 ( 0.328602)
normal again 0.004792 0.002796 0.007588 ( 0.007588)
ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]
normal 0.004290 0.001430 0.005720 ( 0.005719)
loaded features 0.098113 0.210443 0.308556 ( 0.308597)
normal again 0.006900 0.000000 0.006900 ( 0.006899)
ruby 3.3.0dev (2023-02-27T21:34:27Z master ba55706fb6) [x86_64-linux]
normal 0.003708 0.001236 0.004944 ( 0.004942)
loaded features 0.118627 0.160026 0.278653 ( 0.278693)
normal again 0.000000 0.006377 0.006377 ( 0.006378)
user system total real
require 800 0.017006 0.000000 0.017006 ( 0.017004) ruby 2.7.7p221
require 800 0.016817 0.000000 0.016817 ( 0.016819) ruby 3.0.4p208
require 800 0.023295 0.000000 0.023295 ( 0.023295) ruby 3.1.0p0
require 800 0.020107 0.000000 0.020107 ( 0.020106) ruby 3.1.3p185
require 800 0.009945 0.009533 0.019478 ( 0.019476) ruby 3.3.0dev
Updated by jeremyevans0 (Jeremy Evans) over 1 year ago
Since @matz (Yukihiro Matsumoto) asked for an update and I won't be able to attend the dev meeting, my understanding is that rebuilding the loaded feature index is much slower in Ruby 3.1+, because realpaths are calculated for every loaded feature every time the loaded feature index is rebuilt. This was done to fix #17885.
I didn't consider this a major issue at the time, because in general, the loaded feature index does not need to be rebuilt unless $LOADED_FEATURES
is mutated, and it shouldn't be during normal use. Also, the slowdown is not that dramatic on platforms supporting realpath
function. It's only very bad on platforms that are also emulating realpath
(realpath
emulation is quite slow).
One possible optimization I haven't tried yet is using the previous vm->loaded_features_realpaths
hash when rebuilding the loaded feature index. Basically, first try looking at the previous hash for the realpath, and if it is found there, assume the realpath has not changed. If the realpath doesn't exist in the previous hash, then try to calculate it. I'll see if I have time today to test that.
Updated by jeremyevans0 (Jeremy Evans) over 1 year ago
jeremyevans0 (Jeremy Evans) wrote in #note-4:
One possible optimization I haven't tried yet is using the previous
vm->loaded_features_realpaths
hash when rebuilding the loaded feature index. Basically, first try looking at the previous hash for the realpath, and if it is found there, assume the realpath has not changed. If the realpath doesn't exist in the previous hash, then try to calculate it. I'll see if I have time today to test that.
I wasn't able to use this approach, since the hash needs to map loaded feature paths to realpaths (and vm->loaded_features_realpaths
does not do that), but I was able to add a new hash that does that, and significantly improves performance even when realpath
is not emulated. I submitted a pull request for it: https://github.com/ruby/ruby/pull/7699
Updated by jeremyevans (Jeremy Evans) over 1 year ago
- Status changed from Open to Closed
Applied in changeset git|1f115f141dd17f75049a5e17107906c5bcc372e1.
Speed up rebuilding the loaded feature index
Rebuilding the loaded feature index slowed down with the bug fix
for #17885 in 79a4484a072e9769b603e7b4fbdb15b1d7eccb15. The
slowdown was extreme if realpath emulation was used, but even when
not emulated, it could be about 10x slower.
This adds loaded_features_realpath_map to rb_vm_struct. This is a
hidden hash mapping loaded feature paths to realpaths. When
rebuilding the loaded feature index, look at this hash to get
cached realpath values, and skip calling rb_check_realpath if a
cached value is found.
Fixes [Bug #19246]
Updated by MSP-Greg (Greg L) over 1 year ago
Thank you for working on this. Refreshed data:
ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [x64-mingw32]
user system total real
normal 0.015000 0.078000 0.093000 ( 0.103699)
loaded features 0.078000 0.078000 0.156000 ( 0.157629)
normal again 0.016000 0.063000 0.079000 ( 0.100165)
ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x64-mingw32]
user system total real
normal 0.031000 0.047000 0.078000 ( 0.102176)
loaded features 0.078000 0.062000 0.140000 ( 0.155943)
normal again 0.000000 0.094000 0.094000 ( 0.098305)
ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x64-mingw-ucrt]
user system total real
normal 0.016000 0.125000 0.141000 ( 0.167011)
loaded features 4.672000 24.125000 28.797000 ( 31.891813)
normal again 0.093000 0.266000 0.359000 ( 0.369579)
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x64-mingw-ucrt]
user system total real
normal 0.047000 0.110000 0.157000 ( 0.173098)
loaded features 5.141000 23.828000 28.969000 ( 31.793334)
normal again 0.062000 0.297000 0.359000 ( 0.369032)
ruby 3.3.0dev (2023-04-14T08:07:33Z master 6c1b604e29) [x64-mingw-ucrt]
user system total real
normal 0.000000 0.141000 0.141000 ( 0.162872)
loaded features 0.078000 0.203000 0.281000 ( 0.298789)
normal again 0.031000 0.125000 0.156000 ( 0.164640)
Ruby master is approx two orders of magnitude faster than Ruby 3.1 & 3.2. A bit slower than Ruby 2.7 and 3.0, but well within acceptable results.
I assume this will be backported to Ruby 3.1 & 3.2?
Thanks again.
Greg
Updated by jeremyevans0 (Jeremy Evans) over 1 year ago
- Backport changed from 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED
I've updated the backport flags for this issue. However, as always, it is up to the branch maintainer whether or not to backport.
Updated by thomthom (Thomas Thomassen) over 1 year ago
Thank you very much for looking into this. We'll be testing the patch to see how the performance is with our application in our, somewhat special use case.
I got a follow up question: As mentioned, the reason why we modify $LOADED_FEATURES
is because we provide our own require
method for encrypted source code. But the current API rb_provide
isn't ideal for this scenario. Partly because the string passed in needs to always live. The other is that for our reloading functionality we need to remove items from this array. If we could use C API methods designed to add/remove to this array without needing to rebuild the cache that'd give us some very nice performance boosts. And make our logic more robust.
Is this something that could be considered? (Maybe best if I log a separate ticket for this topic?)
Updated by jeremyevans0 (Jeremy Evans) over 1 year ago
thomthom (Thomas Thomassen) wrote in #note-9:
Is this something that could be considered? (Maybe best if I log a separate ticket for this topic?)
It would be best to log a separate issue stating what your use case is, and what you proposed feature is for fixing it. Even better if you can come up with a proposed patch.
Updated by byroot (Jean Boussier) over 1 year ago
But the current API rb_provide isn't ideal for this scenario. Partly because the string passed in needs to always live.
For what it's worth, I just checked the current version of rb_provide
and it uses rb_fstring_cstr()
which does end up copying the char *
. I'm not sure which version exactly caused you issues, but I think you could go back to use rb_provide
again.
Updated by MSP-Greg (Greg L) over 1 year ago
Could this get backported to 3.1 & 3.2?
See:
https://github.com/ruby/ruby/pull/7699
https://github.com/ruby/ruby/commit/61b974629e3571
Updated by nagachika (Tomoyuki Chikanaga) over 1 year ago
- Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE
Thank you for creating a backport patch and pull request. Merged into ruby_3_2.
Updated by nagachika (Tomoyuki Chikanaga) over 1 year ago
- Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED
I have reverted the backport at https://github.com/ruby/ruby/commit/1c7624469880bcb964be09a49e4907873f45b026.
It seems break CI on MinGW.
https://github.com/ruby/ruby/actions/runs/5698775895/job/15447455988
Updated by MSP-Greg (Greg L) about 1 year ago
@nagachika (Tomoyuki Chikanaga)
Thanks for your work on this. Please see https://github.com/ruby/ruby/pull/8252
Updated by nagachika (Tomoyuki Chikanaga) about 1 year ago
@MSP-Greg (Greg L) Thank you for investigations and providing patches! I will handle it soon.
Updated by nagachika (Tomoyuki Chikanaga) about 1 year ago
- Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE
Updated by usa (Usaku NAKAMURA) about 1 year ago
- Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE to 3.0: DONTNEED, 3.1: DONE, 3.2: DONE
ruby_3_1 8a8d889ca2ba1ace8784e49e02a6616dd5765d9e merged revision(s) 1f115f141dd17f75049a5e17107906c5bcc372e1.
Updated by nagachika (Tomoyuki Chikanaga) 12 months ago
- Backport changed from 3.0: DONTNEED, 3.1: DONE, 3.2: DONE to 3.0: DONTNEED, 3.1: DONE, 3.2: REQUIRED
I will fill Backport field with "3.2: REQUIRED" according to the comments in https://github.com/ruby/ruby/pull/8275. I will retry to backport to ruby_3_2.
Updated by nagachika (Tomoyuki Chikanaga) 12 months ago
- Backport changed from 3.0: DONTNEED, 3.1: DONE, 3.2: REQUIRED to 3.0: DONTNEED, 3.1: DONE, 3.2: DONE
ruby_3_2 5baf94f9131fb45d50c8c408e007a138ced46606 merged revision(s) 1f115f141dd17f75049a5e17107906c5bcc372e1.
Updated by thomthom (Thomas Thomassen) 10 months ago
byroot (Jean Boussier) wrote in #note-11:
But the current API rb_provide isn't ideal for this scenario. Partly because the string passed in needs to always live.
For what it's worth, I just checked the current version of
rb_provide
and it usesrb_fstring_cstr()
which does end up copying thechar *
. I'm not sure which version exactly caused you issues, but I think you could go back to userb_provide
again.
The comment for rb_fstring_cstr
:
rb_fstring_new and rb_fstring_cstr family create or lookup a frozen
* shared string which refers a static string literal. `ptr` must
* point a constant string.
https://github.com/ruby/ruby/blob/c68ce6f7f5e6020409cfe689ba5946f284828d4b/string.c#L476-L486
That doesn't sound like it's doing a copy. It's expecting a static string literal. Which is why we ran into issue when we were providing non-static strings.
Updated by byroot (Jean Boussier) 10 months ago
That doesn't sound like it's doing a copy.
My mad, you are right, I missed the copy
argument is passed as false.
Honestly I think we could change that. Feature names are generally small, it wouldn't be a noticeable performance degradation to assume the char *
isn't constant and to always copy it.