Misc #17565
Open
Prefer use of access(2) in rb_file_load_ok() to check for existence of require'd files
Description
When using Ruby in Docker (2.5 in our case, but the code has been unchanged for 15 years across all versions) with a large $LOAD_PATH, millions of calls are made to open(2) with a mean cost of 130µsec per call, whereas a call to access(2) costs around 5× less (something around 28µsec).
With a Rails 5 app, without Zeitwerk, the load path is searched iteratively for a file that defines a constant. This causes something like 2,000,000 calls to open(2), of which 97.5% fail with ENOENT.
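To illustrate why an iterative load-path search produces so many failing probes, here is a minimal sketch. The helper `probe_load_path`, the directory names, and the count of 40 entries are all hypothetical and chosen only for illustration; real `require` also tries several file extensions, which this ignores.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: probe each load-path directory in order and
# count how many probes miss before the file is found.
def probe_load_path(load_path, feature)
  misses = 0
  load_path.each do |dir|
    candidate = File.join(dir, "#{feature}.rb")
    return [candidate, misses] if File.file?(candidate)
    misses += 1
  end
  [nil, misses]
end

Dir.mktmpdir do |root|
  # Build 40 fake load-path entries; only the last one holds the file.
  dirs = Array.new(40) { |i| File.join(root, "gem_#{i}", "lib") }
  dirs.each { |d| FileUtils.mkdir_p(d) }
  File.write(File.join(dirs.last, "widget.rb"), "# placeholder\n")

  found, misses = probe_load_path(dirs, "widget")
  total = misses + 1
  puts "found #{File.basename(found)} after #{misses} misses " \
       "(#{(100.0 * misses / total).round(1)}% of probes failed)"
end
```

With the file in the last of 40 directories, 39 of 40 probes miss, which happens to echo the 97.5% failure rate reported above.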
I believe that the cost of two syscalls (open(2) only after a successful access(2)) would be worth it, at least in our case, because we would shave off something like 1,900,000×90µsec (2.85 minutes) from the three-minute boot time of our application.
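The estimate can be checked with a few lines of arithmetic. The sketch below only uses the figures quoted in this report; the constant names are made up for the example, and nothing here is newly measured. It also works out the extra cost added to the calls that succeed, since those now pay for access(2) as well as open(2).

```ruby
# Figures quoted in the report; constant names are illustrative only.
TOTAL_OPEN_CALLS = 2_000_000
ENOENT_FRACTION  = 0.975
OPEN_COST_USEC   = 130
ACCESS_COST_USEC = 28

FAILING_CALLS    = (TOTAL_OPEN_CALLS * ENOENT_FRACTION).round  # 1,950,000
SUCCESSFUL_CALLS = TOTAL_OPEN_CALLS - FAILING_CALLS            # 50,000
PER_CALL_GAP     = OPEN_COST_USEC - ACCESS_COST_USEC           # 102

# The report rounds down to 1,900,000 calls and 90 usec per call.
ESTIMATED_SAVING_MIN = 1_900_000 * 90 / 1_000_000.0 / 60

# Successful lookups pay for one additional access(2) call each.
EXTRA_COST_SEC = SUCCESSFUL_CALLS * ACCESS_COST_USEC / 1_000_000.0

puts "failing calls:         #{FAILING_CALLS}"
puts "measured gap per call: #{PER_CALL_GAP} usec"
puts "estimated saving:      #{ESTIMATED_SAVING_MIN} minutes"
puts "extra cost on hits:    #{EXTRA_COST_SEC} seconds"
```

Because the rounded inputs (1,900,000 calls, 90µsec) are both lower than the quoted measurements (1,950,000 calls, 102µsec), the 2.85 minute figure is the cautious end of the estimate.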
I prepared a very naïve patch with a simple early return in rb_file_load_ok:
diff --git a/file.c b/file.c
index 3bf092c05c..c7a7635125 100644
--- a/file.c
+++ b/file.c
@@ -5986,6 +5986,16 @@ rb_file_load_ok(const char *path)
O_NDELAY |
#endif
0);
+ if (access(path, R_OK) == -1) return 0;
int fd = rb_cloexec_open(path, mode, 0);
if (fd == -1) return 0;
rb_update_max_fd(fd);
This hasn't been exhaustively tested, as I simply haven't had time yet, but at least it compiled and passed make check.
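For readers who prefer to see the control flow outside of C, the following is a rough Ruby-level analogue of the early return in the patch. It is not the patch itself: `load_ok?` is a hypothetical name, `File.readable_real?` is used on the assumption that it is backed by access(2) with R_OK on POSIX systems, and the regular-file check is a simplification of whatever the C function does after opening.

```ruby
require "tmpdir"

# Hypothetical Ruby-level analogue of the patched rb_file_load_ok:
# a cheap readability check first, then the real open.
def load_ok?(path)
  # Early return mirrors `if (access(path, R_OK) == -1) return 0;`.
  return false unless File.readable_real?(path)

  # The open result is still checked, because the file can vanish
  # or change between the two calls.
  File.open(path, File::RDONLY | File::NONBLOCK) { |f| f.stat.file? }
rescue SystemCallError
  false
end

Dir.mktmpdir do |dir|
  present = File.join(dir, "present.rb")
  File.write(present, "# placeholder\n")

  puts load_ok?(present)                       # existing regular file
  puts load_ok?(File.join(dir, "missing.rb"))  # rejected by the early check
end
```

The early check is only an optimisation for the failing case. Correctness still rests on the result of the open, so a file removed between the two calls is handled the same way as before.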
I spoke with Aaron Patterson on Twitter, who suggested that a wiser approach might be a heuristic one level higher (rb_find_file?) that switches strategy based on the length of the $LOAD_PATH.
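A heuristic of that kind might look something like the sketch below. Everything in it is an assumption made for illustration: the threshold value, the `probe_strategy` name, and the strategy symbols do not come from Ruby's source or from the conversation, and a real implementation would live in C.

```ruby
# Hypothetical threshold; no concrete number has been proposed.
PRECHECK_THRESHOLD = 50

# Pick a probing strategy from the load path length alone.
# Short paths keep the single open(2); long paths pre-check with access(2).
def probe_strategy(load_path, threshold: PRECHECK_THRESHOLD)
  load_path.length > threshold ? :access_then_open : :open_only
end

puts probe_strategy(Array.new(10, "/some/dir"))
puts probe_strategy(Array.new(500, "/some/dir"))
puts probe_strategy($LOAD_PATH)
```

The appeal of keying on length is that small scripts with a short load path keep today's single-syscall behaviour, while large applications get the cheaper failing probe.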
Alternatively, maybe the patch could be guarded somehow and conditionally compiled only into the Rubies built for Docker, in a way that is portable to the common Ruby version managers.
I am opening this ticket to track my own work, as much as anything, with no expectation that someone implement this on my behalf. I am eager to contribute to Ruby for all the benefit I have seen from it in my career.
If anyone knows of reasons why this may be an unsuccessful adventure, I would gratefully receive any and all feedback.