Bug #19246

closed

Rebuilding the loaded feature index much slower in Ruby 3.1

Added by thomthom (Thomas Thomassen) almost 2 years ago. Updated 10 months ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:111342]

Description

Some background to this issue. (This is an unconventional use of Ruby, but I hope you will bear with me.)

We ship the Ruby interpreter with our desktop application (SketchUp) to provide plugin support.

One feature we have had since at least 2006 (maybe earlier; it is hard to track the history beyond that) is a custom alternative require method: Sketchup.require. This allows the users of our API to load encrypted Ruby files.

This originally used rb_provide to add the path of the encrypted file to the list of loaded features. However, somewhere between Ruby 2.2 and 2.5 some string optimisations were made, and rb_provide stopped copying the string passed to it. Instead it just held on to a pointer to it. In our case that string came from user-land, passed in from Sketchup.require, and would eventually be garbage collected, causing access violation crashes.

To work around that, we changed our custom Sketchup.require to push to $LOADED_FEATURES directly. There was a small penalty because the index was rebuilt afterwards, but it was negligible.
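As a rough illustration of that workaround (not our actual implementation), the sketch below shows a custom loader that records each file in $LOADED_FEATURES itself. The module name CustomLoader and the method require_once are hypothetical, and decryption is replaced by a plain File.read.

```ruby
# Hypothetical stand-in for an application-specific loader.
# The names CustomLoader and require_once are assumptions for illustration.
module CustomLoader
  # Loads `path` once and records it in $LOADED_FEATURES.
  # Returns true if the file was evaluated, false if it was already recorded.
  def self.require_once(path)
    feature = File.expand_path(path)
    return false if $LOADED_FEATURES.include?(feature)

    # A real loader would decrypt the file here; this sketch only reads it.
    source = File.read(feature)
    eval(source, TOPLEVEL_BINDING, feature)

    # Pushing to the array directly means Ruby has to rebuild its internal
    # loaded-feature index on a later `require` call.
    $LOADED_FEATURES << feature
    true
  end
end
```

Every call that loads a new file mutates $LOADED_FEATURES once, which is the pattern the repro further down simulates.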

Recently we tried to upgrade the Ruby interpreter in our application from 2.7 to 3.1 and found a major performance regression when using our Sketchup.require: a plugin that used to load in half a second now takes 30 seconds.

From https://bugs.ruby-lang.org/issues/18452 it sounds like there is some expected extra penalty due to changes in how the index is built. But should it really be this much?

Example minimal repro to simulate the issue:

# frozen_string_literal: true
require 'benchmark'

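iterations = 200

# Make sure the scratch directory exists before writing the empty feature files.
tmp_dir = "#{__dir__}/tmp"
Dir.mkdir(tmp_dir) unless Dir.exist?(tmp_dir)

foo_files = iterations.times.map { |i| "#{__dir__}/tmp/foo-#{i}.rb" }
foo_files.each { |f| File.write(f, "") }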

bar_files = iterations.times.map { |i| "#{__dir__}/tmp/bar-#{i}.rb" }
bar_files.each { |f| File.write(f, "") }

biz_files = iterations.times.map { |i| "#{__dir__}/tmp/biz-#{i}.rb" }
biz_files.each { |f| File.write(f, "") }

Benchmark.bm do |x|
  x.report('normal') {
    foo_files.each { |file|
      require file
    }
  }
  x.report('loaded_features') {
    foo_files.each { |file|
      require file
      $LOADED_FEATURES << "#{file}-fake.rb"
    }
  }
  x.report('normal again') {
    biz_files.each { |file|
      require file
    }
  }
end

C:\Users\Thomas\SourceTree\ruby-perf>ruby27.bat
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x64-mingw32]

C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
       user     system      total        real
normal  0.000000   0.031000   0.031000 (  0.078483)
loaded_features  0.015000   0.000000   0.015000 (  0.038759)
normal again  0.016000   0.032000   0.048000 (  0.076940)

C:\Users\Thomas\SourceTree\ruby-perf>ruby30.bat
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x64-mingw32]

C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
       user     system      total        real
normal  0.000000   0.031000   0.031000 (  0.074733)
loaded_features  0.032000   0.000000   0.032000 (  0.038898)
normal again  0.000000   0.047000   0.047000 (  0.076343)

C:\Users\Thomas\SourceTree\ruby-perf>ruby31.bat
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x64-mingw-ucrt]

C:\Users\Thomas\SourceTree\ruby-perf>ruby test-require.rb
       user     system      total        real
normal  0.016000   0.031000   0.047000 (  0.132633)
loaded_features  1.969000  11.500000  13.469000 ( 18.395761)
normal again  0.031000   0.125000   0.156000 (  0.249130)

Right now we're exploring options to deal with this, because the performance degradation is a blocker for us upgrading. We also have 16 years of plugins created by third-party developers, which makes it impossible for us to drop this feature.

Some options as-is, none of which are ideal:

  1. We revert to using rb_provide but ensure the string passed in is not owned by Ruby, instead building a list of strings that we keep around for the lifetime of the application process. The problem is that some of our plugin developers have on occasion released plugins that touch $LOADED_FEATURES, and if such a plugin is installed on a user's machine it might cause the application to become unresponsive for minutes. The other non-ideal aspect of using rb_provide is that we would be using it in ways it wasn't really intended for (from what I understand). And it's not an official API?

  2. We create a separate way for our Sketchup.require to keep track of its loaded features, but that would diverge even more from the behaviour of require. Replicating require's functionality is not trivial and would be prone to subtle errors and further divergence. It also doesn't address the issue that existing plugins contain code that touches $LOADED_FEATURES. (And it's not something we can just ask people to clean up. From previous experience, old versions stick around for a long time and are very hard to purge from circulation.)
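To make option 2 more concrete, here is a minimal sketch of a registry kept apart from $LOADED_FEATURES, so that recording a custom-loaded file never touches the global array. FeatureRegistry and its methods are hypothetical names, not an existing Ruby or SketchUp API.

```ruby
require 'set'

# Hypothetical registry for files loaded through a custom require.
# It is deliberately separate from $LOADED_FEATURES.
module FeatureRegistry
  @loaded = Set.new

  # Records `path`; returns true if it was newly added, false if already known.
  def self.provide(path)
    !@loaded.add?(File.expand_path(path)).nil?
  end

  # Returns true if `path` has been recorded.
  def self.provided?(path)
    @loaded.include?(File.expand_path(path))
  end

  # Removes `path`; returns true if it was recorded before, false otherwise.
  def self.forget(path)
    !@loaded.delete?(File.expand_path(path)).nil?
  end
end
```

This also shows the divergence described above: a plain require of the same path knows nothing about this registry, and the sketch does nothing for plugins that modify $LOADED_FEATURES themselves.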

I have two questions for the Ruby maintainers:

  1. Would it be reasonable to see an API for adding to, removing from and checking $LOADED_FEATURES that would allow for a more ideal implementation of custom require functionality?

  2. Is the performance difference in rebuilding the loaded feature index really expected to be as high as what we're seeing? An increase of nearly 100 times? Is there something there that might be addressed to make the rebuild less expensive again? (This would really help to address our challenges with third-party plugins occasionally touching the global.)
