Bug #19378
Windows: Use less syscalls for faster require of big gems
Description
Hello 🙂
Problem
require is slow on Windows for big gems (for example, require 'gtk3' takes 3+ seconds). This is a problem for people who want to make cross-platform GUI apps with Ruby.
Possible Reason
As touched on in #15797, it seems that require uses realpath, which is emulated on Windows. The emulation checks every parent directory, so the same syscalls run many times.
Testfile
C:\tmp\speedtest\testrequire.rb:
require __dir__ + "/helloworld1.rb"
require __dir__ + "/helloworld2.rb"
ruby --disable-gems C:\tmp\speedtest\testrequire.rb
Syscalls per File/Directory:
- CreateFile
- QueryInformationVolume
- QueryIdInformation
- QueryAllInformationFile
- QueryNameInformationFile
- QueryNameInformationFile
- QueryNormalizedNameInformationFile
- CloseFile
Files/Directories checked
- C:\tmp
- C:\tmp\speedtest
- C:\tmp\speedtest\helloworld1.rb
- C:\tmp
- C:\tmp\speedtest
- C:\tmp\speedtest\helloworld2.rb
For two required files Ruby had to do 8 * 6 = 48 syscalls (8 syscalls for each of the 6 paths checked).
The syscalls originate from rb_w32_reparse_symlink_p / lstat.
Rubygems live in subfolders with 9+ parts: "C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb"
Each file takes 8 * 9 = 72+ calls. For variant.rb, whose path has 10 parts, it is 80 calls.
The results of these syscalls don't change in such a short time, so it should be possible to cache them.
With require_relative it's twice as many calls.
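The arithmetic above can be reproduced with a small sketch. It assumes 8 syscalls per checked path (the count from the Procmon list above) and that every directory below the drive root, plus the file itself, is checked once. The helper names `checked_paths` and `estimated_syscalls` are made up for illustration and are not part of Ruby.

```ruby
# Assumption taken from the trace above: 8 syscalls per checked file or directory.
SYSCALLS_PER_PATH = 8

# Hypothetical helper: lists every path the emulated realpath would check
# for one file (each parent directory below the drive, then the file itself).
def checked_paths(path)
  drive, *parts = path.split(/[\\\/]/)
  parts.each_index.map { |i| ([drive] + parts[0..i]).join("\\") }
end

# Hypothetical helper: estimated syscalls for a single require of +path+.
def estimated_syscalls(path)
  checked_paths(path).size * SYSCALLS_PER_PATH
end

puts checked_paths('C:\tmp\speedtest\helloworld1.rb')
puts estimated_syscalls('C:\tmp\speedtest\helloworld1.rb') # 24, so 48 for both test files
puts estimated_syscalls('C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb') # 80
```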
Other testcases
Same result:
File.realpath __dir__ + "/helloworld1.rb"
File.realpath __dir__ + "/helloworld2.rb"
File.stat __dir__ + "/helloworld1.rb"
File.stat __dir__ + "/helloworld2.rb"
It does not happen with $LOAD_PATH.resolve_feature_path(__dir__ + "/helloworld1.rb")
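To compare these calls by wall-clock time rather than by Procmon events, a rough timing sketch like the following could be used. The helper name `time_path_calls` and the iteration count are arbitrary choices for illustration. The numbers depend heavily on platform and directory depth, so no expected output is given.

```ruby
require "benchmark"
require "tmpdir"

# Hypothetical helper: runs each call n times over the given files and
# returns a hash of label => elapsed seconds.
def time_path_calls(files, n: 1_000)
  {
    "File.realpath"        => ->(f) { File.realpath(f) },
    "File.stat"            => ->(f) { File.stat(f) },
    "resolve_feature_path" => ->(f) { $LOAD_PATH.resolve_feature_path(f) },
  }.transform_values do |call|
    Benchmark.realtime { n.times { files.each(&call) } }
  end
end

Dir.mktmpdir do |dir|
  # Empty stand-ins for the two test files used above.
  files = %w[helloworld1.rb helloworld2.rb].map do |name|
    File.join(dir, name).tap { |path| File.write(path, "") }
  end

  time_path_calls(files).each do |label, seconds|
    puts format("%-22s %.4fs", label, seconds)
  end
end
```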
Request
Would it be possible to cache the stat calls when using require?
I tried to implement a cache inside the ruby source code, but failed.
If not, is there now a way to combine ruby files into one?
I previously talked about require here: YJIT: Windows support lacking.
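To illustrate the caching idea, here is a minimal Ruby model. It is not Ruby's C implementation and not a proposed patch. The class `CachedPathChecker` is hypothetical, and the block passed to it stands in for the expensive per-path check (the lstat / reparse-point probe). Directory results are memoized, while the required file itself is always probed.

```ruby
# Hypothetical sketch: memoize the per-directory check so that parent
# directories shared by many required files are probed only once.
class CachedPathChecker
  attr_reader :probes

  def initialize(&probe)
    @probe = probe   # stands in for the expensive lstat / reparse-point check
    @cache = {}      # directory path => probe result
    @probes = 0      # how many times the expensive check actually ran
  end

  # Checks every component of +path+. Directories go through the cache,
  # the final file is always probed.
  def check(path)
    parts = path.split("/")
    prefixes = (1...parts.size).map { |i| parts[0..i].join("/") }
    *dirs, file = prefixes
    dirs.each { |dir| @cache.fetch(dir) { @cache[dir] = run_probe(dir) } }
    run_probe(file)
  end

  private

  def run_probe(path)
    @probes += 1
    @probe.call(path)
  end
end

checker = CachedPathChecker.new { |path| File.symlink?(path) }
checker.check("C:/tmp/speedtest/helloworld1.rb")
checker.check("C:/tmp/speedtest/helloworld2.rb")
puts checker.probes # 4 probes instead of 6 without the cache
```

A real implementation would also need a rule for invalidating entries when a directory is replaced by a symlink or junction while the process is running. This sketch ignores that.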
How to reproduce
Ruby versions: At least 3.0+, most likely older ones too.
Tested using Ruby Installer 3.1 and 3.2.
Procmon Software by Sysinternals
Updated by aidog (Andi Idogawa) almost 2 years ago
Thanks to the new Windows build docs by ioquatix, I made a test patch to check how much faster require would be if some of the repeated syscalls on the folders (c:/tmp/, c:/tmp/speedtest, gems and so on) were avoided:
tzinfo: 0.8s to 0.3s
gtk3: 2.8s to 2.5s (I see another similar issue inside the gem C code)
Windows has GetFinalPathNameByHandleW since Vista, which some other projects use for realpath. Would it work for Ruby?
Updated by nobu (Nobuyoshi Nakada) almost 2 years ago
- Status changed from Open to Assigned
- Assignee set to windows
Updated by joshc (Josh C) over 1 year ago
I've also noticed a significant increase in file IO events (as reported by procmon) due to https://github.com/ruby/ruby/commit/79a4484a072e9769b603e7b4fbdb15b1d7eccb15 introduced in Ruby 3.1.0. The code tries to prevent the same file from being loaded twice by calling rb_realpath_internal to see if the realpath has already been loaded. This is a problem on systems like Windows that use Ruby's emulated realpath, especially when there are deeply nested directories. I've attached a revert patch. It'd be great to use GetFinalPathNameByHandleW and avoid the emulate code.
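The realpath check described above can be modeled in a few lines of Ruby. This is a simplified illustration, not the C code from that commit. The class `RealpathFeatureIndex` and its method are hypothetical names. It shows why every require pays for a realpath call: the realpath is the key that identifies a file reached through two different paths.

```ruby
require "set"
require "tmpdir"

# Hypothetical model of "do not load a file with the same realpath twice".
class RealpathFeatureIndex
  def initialize
    @loaded_realpaths = Set.new
  end

  # Returns true the first time a file is seen, false when another path
  # to the same file was already recorded. Each call resolves the realpath.
  def first_load?(path)
    !@loaded_realpaths.add?(File.realpath(path)).nil?
  end
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "helloworld1.rb")
  link   = File.join(dir, "alias.rb")
  File.write(target, "")
  File.symlink(target, link) # may need extra privileges on Windows

  index = RealpathFeatureIndex.new
  p index.first_load?(target) # true
  p index.first_load?(link)   # false, same realpath as target
end
```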
Updated by jeremyevans0 (Jeremy Evans) over 1 year ago
joshc (Josh C) wrote in #note-3:
I've attached a revert patch.
I think the only way we would revert 79a4484a072e9769b603e7b4fbdb15b1d7eccb15 is if someone can come up with an alternative approach to fixing Bug #17885.
It'd be great to use GetFinalPathNameByHandleW and avoid the emulate code.
If you mean to use this on Windows for the internals of File#realpath, I think we would be open to a backwards compatible patch for that, but @usa (Usaku NAKAMURA) would need to decide as he maintains the mswin64 platform.
Updated by MSP-Greg (Greg L) over 1 year ago
Code using GetFinalPathNameByHandleW already exists in win32/win32.c, see https://github.com/ruby/ruby/blob/c43fbe4ebd2b519601f0b90ca98fa096799d3846/win32/win32.c#L2013-L2022
For cross-reference, see also Bug #19246 'Rebuilding the loaded feature index much slower in Ruby 3.1'
Updated by MSP-Greg (Greg L) over 1 year ago
Just to be clear, this issue affects all Windows MRI platforms, so both mswin64 and mingw32 (mingw & ucrt builds) are affected.