Bug #20816
Open
Potential regression in Ruby 3.3.x (compared with 3.1 and 3.2) regarding fast syscalls and multi-threading.
Description
OK, I'm sorry in advance if this is not really a regression (maybe it's the price to pay for other improvements), but it still showed up as a 30% performance drop in my benchmark, so I think it may be worth a look.
For the full story about how I went down this rabbit hole, I just wrote a blog post to avoid cluttering the issue: https://updown.io/blog/weird-results-comparing-ruby-3-1-3-2-3-3-with-jemalloc-and-yjit (please read it first)
Here I'm gonna put my last findings and the smallest reproducible example I was able to write.
So basically I noticed that quick syscalls (at least with a good SSD) like File.exist? or File.mtime, when used in a multi-threaded environment (Sidekiq in my initial testing, but I reduced it to a few Thread.new calls), are significantly slower in 3.3.x (tested with the same results on 3.3.0, 3.3.2 and 3.3.5 so far) than in 3.1.2 or 3.2.5: about twice as slow.
These syscalls are heavily used by the Rails Reloader which is how it had such a huge impact (30%) in my testing.
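To illustrate why a reloader leans so heavily on these calls, here is a minimal sketch of mtime-based change detection. The `MtimeWatcher` class and its method names are hypothetical and written for this issue only; this is not Rails' actual implementation, just the general polling pattern under that assumption.

```ruby
# Hypothetical class for illustration only; NOT the Rails Reloader's code.
class MtimeWatcher
  def initialize(paths)
    @paths = paths
    @last_seen = newest_mtime
  end

  # Returns true when the newest mtime among the watched files has advanced
  # since the previous check. Every call stats each watched file again
  # (one File.exist? and one File.mtime per path).
  def changed?
    current = newest_mtime
    changed = current > @last_seen
    @last_seen = current
    changed
  end

  private

  # Newest modification time among the paths that currently exist.
  def newest_mtime
    existing = @paths.select { |path| File.exist?(path) }
    existing.map { |path| File.mtime(path) }.max || Time.at(0)
  end
end
```

With something like `MtimeWatcher.new(Dir["app/**/*.rb"])` checked on every job or request, each worker thread ends up issuing a burst of very short syscalls, which is the pattern the reproduction script isolates.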
Here is the small reproduction script I wrote:
#!/usr/bin/env ruby
GC.disable # just to make sure it doesn't skew results
THREADS = (ARGV.first || 10).to_i # choose thread count from CLI
N = 10_000_000 # to make the test longer or shorter
puts "Ruby #{RUBY_VERSION} / YJIT: #{RubyVM::YJIT.enabled?} / #{THREADS} Threads"
Array.new(THREADS).map do
  Thread.new do
    (N/THREADS).times { File.mtime(__FILE__) }
  end
end.each(&:join)
puts "#{N} calls completed"
The goal is just to run a lot of fast syscalls across a varying number of threads. Here are the numbers on my machine:
Threads\Ruby | 3.1.2 | 3.3.5 |
---|---|---|
1 | 13.7s | ✅ 14.2s |
2 | 9.7s | ⚠️ 33.4s |
4 | 23.2s | ⚠️ 37.0s |
8 | 21.8s | ⚠️ 39.7s |
16 | 23.1s | ⚠️ 41.5s |
(more numbers in the blog post)
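For anyone who wants to reproduce the timings without an external `time` command, here is a minimal sketch that measures the wall-clock time inside the script. The helper name `time_mtime_calls` is hypothetical, the call count is much smaller than the 10_000_000 used above so it finishes quickly, and it stats the Ruby executable (`RbConfig.ruby`) instead of `__FILE__` so that it also works when pasted into `irb` or run with `ruby -e`.

```ruby
require "rbconfig"

# Hypothetical helper: splits `calls` File.mtime calls across `threads`
# threads and returns the elapsed wall-clock time in seconds.
# Assumption: RbConfig.ruby is used as a file that is known to exist.
def time_mtime_calls(threads:, calls:, path: RbConfig.ruby)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Array.new(threads) do
    Thread.new { (calls / threads).times { File.mtime(path) } }
  end.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Small call count so the sketch finishes quickly.
[1, 2, 4].each do |threads|
  elapsed = time_mtime_calls(threads: threads, calls: 100_000)
  puts format("%2d threads: %.3fs", threads, elapsed)
end
```

The monotonic clock is used so that system clock adjustments during a run cannot distort the measurement.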
I suspected the new N:M thread scheduler introduced in 3.3, even though it's "disabled" by default on the main Ractor, thinking that maybe these changes slightly degraded this (quite specific) scenario. So I gave it a quick try with the feature ENABLED, to see if it would at least change anything, and it did! (though not what I expected):
Threads\Ruby | 3.1.2 | 3.3.5 | 3.3.5 with RUBY_MN_THREADS=1 |
---|---|---|---|
1 | 13.7s | 14.2s | 14.5s ✅ |
2 | 9.7s | ⚠️ 33.4s | 12.3s 🎉 |
4 | 23.2s | ⚠️ 37.0s | 40.5s 🧐 |
8 | 21.8s | ⚠️ 39.7s | 37.9s 🧐 |
16 | 23.1s | ⚠️ 41.5s | 36.3s 🧐 |
(I can make graphs for these numbers if it helps)
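To make the comparison above easier to repeat, here is a rough sketch of a driver that runs the reproduction script in a child process for each thread count, once with the default settings and once with `RUBY_MN_THREADS=1`. The method name `time_run` and the way the script path is passed are assumptions made for this sketch; the measured time includes interpreter startup.

```ruby
require "rbconfig"

# Hypothetical driver: runs `script` in a child Ruby process with the given
# thread count and returns the elapsed wall-clock time in seconds.
# With mn_threads: true the child gets RUBY_MN_THREADS=1; otherwise the
# variable is explicitly removed from the child's environment.
def time_run(script, threads, mn_threads: false)
  env = { "RUBY_MN_THREADS" => (mn_threads ? "1" : nil) }
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ok = system(env, RbConfig.ruby, script, threads.to_s, out: File::NULL)
  raise "benchmark run failed: #{script}" unless ok
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Only run the full matrix when a script path is given on the command line.
if (script = ARGV.first) && File.file?(script)
  [1, 2, 4, 8, 16].each do |threads|
    default = time_run(script, threads)
    mn = time_run(script, threads, mn_threads: true)
    puts format("%2d threads: default %.1fs, RUBY_MN_THREADS=1 %.1fs",
                threads, default, mn)
  end
end
```

Running the child with `RbConfig.ruby` keeps the same interpreter for both columns, so only the environment variable differs between the two runs.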
Now I'm a bit stuck in my investigation. I'm not sure what change in 3.3 caused this (the new thread scheduler is a good lead IMO) or if it even is a regression. Maybe this downside is inevitable in exchange for other benefits? I am happy to contribute more time to this issue and perform more tests, but I would need some guidance on what would be interesting to test from people who understand better than me how this works internally.
Thanks! (and sorry for the long read)