Project

General

Profile

Actions

Bug #9934

closed

High memory usage from file_expand_path_*

Added by tmm1 (Aman Karmani) over 10 years ago. Updated about 10 years ago.

Status:
Closed
Assignee:
-
Target version:
ruby -v:
current
[ruby-core:63114]

Description

All the file expansion routines use EXPAND_PATH_BUFFER() which allocates PATH_MAX bytes on the heap per invocation.
The strings returned by File.expand_path are never realloc'd after they are populated, so they continue using 4kb (on linux) per string.
In our rails app, 22MB of heap usage is due to expanded path name strings.

$ ruby -robjspace -e' puts ObjectSpace.dump(File.expand_path("/foo")) '
{"address":"0x007fa2b44dd6c8", "type":"STRING", "class":"0x007fa2b3f99608", "bytesize":4, "capacity":4098, "value":"/foo", "encoding":"US-ASCII", "memsize":4099, "flags":{"wb_protected":true}}

The following failing patch demonstrates the issue as well:

diff --git a/test/ruby/test_file_exhaustive.rb b/test/ruby/test_file_exhaustive.rb
index 2c945ea..49be9de 100644
--- a/test/ruby/test_file_exhaustive.rb
+++ b/test/ruby/test_file_exhaustive.rb
@@ -458,6 +458,12 @@ class TestFileExhaustive < Test::Unit::TestCase
     end
   end

+  def test_expand_path_memsize
+    require "objspace"
+    path = File.expand_path("/foo")
+    assert_equal 5, ObjectSpace.memsize_of(path)
+  end
+
   def test_expand_path_encoding
     drive = (DRIVE ? 'C:' : '')
     if Encoding.find("filesystem") == Encoding::CP1251

Updated by tmm1 (Aman Karmani) over 10 years ago

This is the best I could come up with. Definitely not ideal since it allocates another ruby object.

diff --git a/file.c b/file.c
index 77facac..71e2d93 100644
--- a/file.c
+++ b/file.c
@@ -3391,6 +3391,9 @@ rb_file_expand_path_internal(VALUE fname, VALUE dname, int abs_mode, int long_na
     rb_str_set_len(result, p - buf);
     rb_enc_check(fname, result);
     ENC_CODERANGE_CLEAR(result);
+    if (OBJ_TAINTED(result)) tainted = 1;
+    result = rb_enc_str_new(RSTRING_PTR(result), p - buf, rb_enc_get(result));
+    if (tainted) OBJ_TAINT(result);
     return result;
 }
 #endif /* _WIN32 */
diff --git a/test/ruby/test_file_exhaustive.rb b/test/ruby/test_file_exhaustive.rb
index 2c945ea..ed43ec0 100644
--- a/test/ruby/test_file_exhaustive.rb
+++ b/test/ruby/test_file_exhaustive.rb
@@ -458,6 +458,12 @@ class TestFileExhaustive < Test::Unit::TestCase
     end
   end
 
+  def test_expand_path_memsize
+    require "objspace"
+    path = File.expand_path("/foo")
+    assert_operator ObjectSpace.memsize_of(path), :<=, 5
+  end
+
   def test_expand_path_encoding
     drive = (DRIVE ? 'C:' : '')
     if Encoding.find("filesystem") == Encoding::CP1251

Updated by normalperson (Eric Wong) over 10 years ago

wrote:

This is the best I could come up with. Definitely not ideal since it
allocates another ruby object.

What about using rb_str_resize?

rb_str_resize may work even better if we remove the 1024 byte threshold
and trust the realloc implementation to do the right thing.

Additionally, rb_str_freeze may also be smarter and force a resize
unconditionally, as keeping the buffer is useless for frozen strings.
Not sure if changing rb_str_freeze this way breaks compatibility
(but we should be able to safely resize for rb_fstring, at least).

Updated by nobu (Nobuyoshi Nakada) over 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r46410.


file.c: shrink expanded path

  • file.c (expand_path): shrink expanded path which no longer needs
    rooms to append. [ruby-core:63114] [Bug #9934]

Updated by dbussink (Dirkjan Bussink) over 10 years ago

This does actually not fix the bug. This only works if a path expanded by File.expand_path is smaller than the string embedded length. If a string is longer than that length, it isn't made smaller and it still uses up too much memory.

Actions #5

Updated by nobu (Nobuyoshi Nakada) over 10 years ago

I think r46413 has fixed it.

Updated by dbussink (Dirkjan Bussink) over 10 years ago

Thanks for the fix!

Updated by tmm1 (Aman Karmani) over 10 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport21
  • Status changed from Closed to Open

Updated by nobu (Nobuyoshi Nakada) over 10 years ago

  • Tracker changed from Backport to Bug
  • Project changed from Backport21 to Ruby master
  • Description updated (diff)
  • Category set to core
  • Status changed from Open to Closed
  • Target version set to 2.2.0
  • ruby -v set to current
  • Backport set to 2.0.0: REQUIRED, 2.1: REQUIRED

Don't move projects, update Backport instead.

Updated by nagachika (Tomoyuki Chikanaga) over 10 years ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: REQUIRED to 2.0.0: REQUIRED, 2.1: DONE

r46408, r46410, r46413, r46414, r46424, r46436, r46437 and partially r44827 (to resolve conflict of r46408) were backported into ruby_2_1 branch at r47215.

Updated by usa (Usaku NAKAMURA) about 10 years ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: DONE to 2.0.0: DONE, 2.1: DONE

backported into ruby_2_0_0 at r47399.

note: ruby 2.0.0 doesn't recognize the length of the teminator of the string's encoding.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0