Bug #6836

Improve File.expand_path performance in Windows

Added by luislavena (Luis Lavena) almost 8 years ago. Updated over 7 years ago.

Target version:
ruby -v:
ruby 2.0.0dev (2012-08-04 trunk 36616) [i386-mingw32]


(original write up in markdown here:

== Background

While investigating the performance issues of (({File.expand_path})) on Windows,
Usaku Nakamura and Nobuyoshi Nakada on [ruby-core:39504] pointed out that
due security concerns, accessing files on Windows required normalized paths.

This was covered in the security update of March 2008, WEBrick file-access
vulnerability [1].

After closer inspection of WEBrick code (mentioned by the security update),
I noticed it uses (({File.expand_path})) to perform the heavy lifting of path
normalization in the request.

The code around this can be inspected in (({prevent_directory_traversal}))[2]
and (({shift_path_info}))[3] methods.

This approach performs a hit into the filesystem, contrary to its
implementation in any other operating system.

(({File.expand_path})) is heavily used by (({require})), which result in slow
application startup, depending on the application size or number of gems it
depends on.

Stepping back for a second, we can see that the security issue is around
WEBrick and the way it determines (({path_info})) absoluteness.

It is also clear that to solve WEBrick security issue, a tax has been applied
to the entire Ruby ecosystem, penalizing startup performance.

With Hiroshi Shirosaki's help, we worked on a patch that:

  • Limit filesystem hit only to WEBrick, using Windows' GetLongPathName [4].
  • Use a Windows-specific API to normalize paths
  • Improve encoding support.

What started as an experiment named Fenix [5] has shown great results on a
variety of systems.

This patch has been integrated in TheCodeShop [6] releases of Ruby 1.9.3 and
tested by Ruby-Core developers Hiroshi Shirosaki and myself.

== Performance

To demonstrate the benefits of this patch, I've used measurements [7] project
and both (({core_require_empty})) and (({core_require_nested})) workloads,
obtaining the following results:

ruby 2.0.0dev (2012-08-03 trunk 36611) [i386-mingw32]
Rehearsal ------------------------------------------------------
core_require_empty 1.186000 3.385000 4.571000 ( 4.676267)
--------------------------------------------- total: 4.571000sec

                      user     system      total        real

core_require_empty 1.217000 3.385000 4.602000 ( 4.643266)

Rehearsal -------------------------------------------------------
core_require_nested 1.514000 3.760000 5.274000 ( 5.305303)
---------------------------------------------- total: 5.274000sec

                       user     system      total        real

core_require_nested 1.466000 3.713000 5.179000 ( 5.233300)

And with patch applied:

ruby 2.0.0dev (2012-08-03 trunk 36611) [i386-mingw32]
Rehearsal ------------------------------------------------------
core_require_empty 0.765000 1.077000 1.842000 ( 1.887603)
--------------------------------------------- total: 1.842000sec

                      user     system      total        real

core_require_empty 0.717000 1.123000 1.840000 ( 1.887603)

Rehearsal -------------------------------------------------------
core_require_nested 0.717000 1.670000 2.387000 ( 2.480405)
---------------------------------------------- total: 2.387000sec

                       user     system      total        real

core_require_nested 0.890000 1.528000 2.418000 ( 2.496004)

Benchmarks were performed all on the same hardware and OS:

  • CPU: Core 2 Duo T7500 @ 2.20Ghz
  • RAM: 4GB
  • HDD: 1.5GB RAMdisk (ImDisk)
  • OS: Windows 7 Ultimate x64

All tests associated (both File and WEBrick ones) pass.

Additional tests that exercise specific aspects of new function were added.

Patch has been tested also against:

  • Visual Studio build of Ruby
  • Ubuntu 12.04
  • Mac OSX

And the patch didn't affect either build or tests of it.

=== Real life impact: Rails

The biggest Ruby project affected by this is Rails applications.

An empty Rails application on startup requires more than 700 files from
different gems:

V:\emptyapp>ruby script\rails runner -e production "p $LOADED_FEATURES.size"

When benchmark startup using w32time [8]:

V:\emptyapp>timer ruby script\rails runner -e production "0"
real 7.285
system 4.539
user 2.683

And patched Ruby:

V:\emptyapp>timer ruby script\rails runner -e production "0"
real 2.620
system 0.873
user 1.700

(best result taken from several warm ups).

Now, a mid-sized application like Enki [9] which loads 1146 files in
production mode, result in:

V:\enki>timer ruby script\rails runner -e production "p $LOADED_FEATURES.size"
real 22.620
system 11.497
user 11.076

Almost ((23 seconds)), compared to patched version:

V:\enki>timer ruby script\rails runner -e production "0"
real 11.013
system 1.981
user 8.938

This change also improves performance of (({rake})) inside Rails, from:

V:\enki>timer rake -T
real 8.689
system 0.015
user 0.000


V:\enki>timer rake -T
real 3.307
system 0.000
user 0.031

Making normal operations more accessible.

Looking forward for your thoughts on these changes.

Thank you.



Also available in: Atom PDF