Excruciatingly slow pathname implementation
In writing a pure-Ruby git implementation, I discovered with rprof that using Pathname was the source of a huge proportion of my library's running time. Recently, Rails contributor José Valim discovered a similar situation in Rails. Anecdotally, removing Pathname resulted in a speedup from 6s to 3s on a single request. Here's his commit removing some Pathname usage from Rails:
I have rewritten Pathname, and the project is on GitHub.
I've included the rewrite as a patch. It's backwards-compatible with the existing Pathname class, and passes RubySpec and MRI tests.
Updated by mame (Yusuke Endoh) over 8 years ago
- Assignee set to akr (Akira Tanaka)
- Target version set to 2.0.0
This is not a bug but a feature request, so this ticket has been moved to Feature.
Your patch might be good, but it is apparently too big. I guess it will take
a long time to review it.
I recommend that you explain the essential changes for performance improvement,
and split the patch into one patch per change. Also, do not remove the documentation.
Yusuke Endoh email@example.com
Updated by stouset (Stephen Touset) over 8 years ago
The patch would be very difficult to do change by change, partly because the entire class is rewritten to descend from String. Suffice it to say, virtually all of the methods in the original Pathname library tried to implement logic themselves, sometimes in a very inefficient manner, rather than falling back upon the well-optimized implementations in File. This rewrite is akin to the switch from CSV to FasterCSV, which I don't believe was submitted as a series of patches. As the code passes all of the related Ruby specs, I wouldn't think this would take too much effort to review and merge in.
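To make the approach described above concrete, here is a minimal sketch of a path class that descends from String and delegates to File instead of reimplementing the logic in Ruby. It is not the code from the patch: the class name SketchPath and the choice of methods are hypothetical, picked only for illustration, and it assumes POSIX-style separators.

```ruby
# Hypothetical illustration only; this is NOT the rewritten Pathname.
# The path *is* a String, so no wrapper object or conversion is needed
# before handing it to File.
class SketchPath < String
  # Delegate to File.basename rather than splitting the string by hand.
  def basename(suffix = '')
    self.class.new(File.basename(self, suffix))
  end

  # Delegate to File.dirname.
  def dirname
    self.class.new(File.dirname(self))
  end

  # Delegate to File.extname; a plain String is enough for an extension.
  def extname
    File.extname(self)
  end

  # Delegate to File.join and wrap the result so calls can be chained.
  def join(*parts)
    self.class.new(File.join(self, *parts))
  end
end

path = SketchPath.new('/usr/lib/ruby').join('1.8', 'pathname.rb')
puts path                  # /usr/lib/ruby/1.8/pathname.rb
puts path.basename('.rb')  # pathname
puts path.dirname          # /usr/lib/ruby/1.8
puts path.extname          # .rb
```

Each method here is a thin wrapper around a File call, which is the property being argued for above; a real replacement would also have to cover the rest of the Pathname API and Windows prefixes.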
The documentation was different because I rewrote the class without anticipating that it would become part of standard Ruby. I can change the code to mimic the original comments in Pathname without too much effort, if it would help get the patch accepted.
Updated by shyouhei (Shyouhei Urabe) over 8 years ago
No, your understanding is wrong.
C:\Users\shyouhei\Documents>C:\ruby\bin\ruby.exe -rpathname -e"p Pathname('C:\').relative_path_from Pathname('D:\')"
C:/ruby/lib/ruby/1.8/pathname.rb:723:in `relative_path_from': different prefix: "C:\" and "D:\" (ArgumentError)
C:\Users\shyouhei\Documents>C:\ruby\bin\ruby.exe -rpathname3 -e"p Pathname('C:\').relative_path_from Pathname('D:\')"