Bug #19969

closed

Regression of memory usage with Ruby 3.1

Added by hsbt (Hiroshi SHIBATA) about 1 year ago. Updated about 1 year ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:115139]

Description

Our company, ANDPAD, Inc., saw memory usage increase by about 20% after upgrading our Rails application from Ruby 3.0 to 3.2.

My colleague found the root cause and wrote this reproduction code:

$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.0.6p216 (2023-06-29 revision bdfe1958a8) +JIT [arm64-darwin22]
248096

$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-07-05 revision 2f603bc4d7) +YJIT [arm64-darwin22]
2949280

Should we revert #16996 for Ruby 3.1 or later? I'm not sure the increased memory usage is a reasonable price for the performance improvement.

Updated by nobu (Nobuyoshi Nakada) about 1 year ago

  • Backport changed from 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED

Updated by Eregon (Benoit Daloze) about 1 year ago

Right, @nobu's approach seems much better than reintroducing that weird behavior for .dup.

Ideally we wouldn't rehash (as in calling each key's #hash method again), but instead just shrink the internal data structure (and do the same when growing it).

Updated by Eregon (Benoit Daloze) about 1 year ago

So apparently some applications were relying on Set#dup/Hash#dup to act like C++'s shrink_to_fit.
Ruby does not have such a method and it feels quite low-level, so it seems better to resize the internal data structure when removing elements/entries and going below some threshold.
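
To illustrate the idea of shrinking on delete once the live entries drop below a threshold, here is a toy sketch. It is not Ruby's st_table: ToyTable, DELETED, and SHRINK_RATIO are hypothetical names, and the 1/4 threshold is an arbitrary choice. Note that the toy uses a plain Hash as its index, so compact! does hash the keys again; a real table could avoid that, for example by keeping hash values alongside the entries.

```ruby
# Toy insertion-ordered table (hypothetical, NOT Ruby's st_table).
class ToyTable
  DELETED = Object.new.freeze # tombstone marking a removed slot
  SHRINK_RATIO = 4            # assumed threshold: compact below 1/4 live

  def initialize
    @entries = [] # slots holding [key, value] pairs or DELETED
    @index = {}   # key => slot position
  end

  def []=(key, value)
    pos = @index[key]
    if pos
      @entries[pos][1] = value
    else
      @index[key] = @entries.size
      @entries << [key, value]
    end
  end

  def [](key)
    pos = @index[key]
    pos && @entries[pos][1]
  end

  def delete(key)
    pos = @index.delete(key)
    return nil unless pos

    value = @entries[pos][1]
    @entries[pos] = DELETED
    # Shrink as soon as live entries fall below the threshold.
    compact! if @index.size * SHRINK_RATIO < @entries.size
    value
  end

  # Number of live entries.
  def size
    @index.size
  end

  # Number of allocated slots, including tombstones.
  def slots
    @entries.size
  end

  private

  # Drop tombstones and renumber the surviving slots.
  def compact!
    @entries = @entries.reject { |entry| entry.equal?(DELETED) }
    @entries.each_with_index { |(key, _value), i| @index[key] = i }
  end
end

t = ToyTable.new
100.times { |i| t[i] = i * 2 }
(1...100).each { |i| t.delete(i) }
p [t.size, t.slots]
```

Without the compact! call, slots would stay at 100 after the deletions even though only one entry is live, which is the shape of the problem in the repro.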

Updated by Eregon (Benoit Daloze) about 1 year ago

As a note, this repro code is very "lucky" to trigger a dup after removing 99.99% of the elements.
I suppose it's done that way to make the effect very clear though.
Without the - [0] the same problem occurs on 3.0:

$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); a=Array.new(10000) { s1 - s2 }; GC.start; puts `ps -o rss= -p #{$$}`.to_i' 
ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x86_64-linux]
3015808
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); a=Array.new(10000) { s1 - s2 - [0] }; GC.start; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x86_64-linux]
74552
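
To make the role of the extra dup visible, the expression s1 - s2 - [0] can be unrolled. This is a sketch that assumes Set#- behaves like dup followed by subtract, as in the pure-Ruby set.rb; step1 and step2 are illustrative names.

```ruby
require "set"

s1 = Set.new(10_000.times) # 0..9999
s2 = Set.new(9_999.times)  # 0..9998

# Step 1: "s1 - s2" copies s1 (10,000 entries), then deletes 9,999 of them.
step1 = s1.dup.subtract(s2)

# Step 2: "- [0]" copies the one-element result again; deleting 0 is a no-op.
step2 = step1.dup.subtract([0])

p step1 == (s1 - s2)
p step2 == (s1 - s2 - [0])
p step2.to_a
```

As discussed in this thread, the second dup is where 3.0 happened to rebuild the table at its current size, while step1 on its own keeps a table that was sized for 10,000 entries.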

If a Set is kept alive for a long time, one way to ensure it uses the minimum amount of space is Set#reset, at the cost of extra time to reset/rehash (which notably calls #hash for every key). It's a time vs. memory trade-off that can be worth it for big, long-lived sets:

$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); a=Array.new(10000) { s=s1 - s2 - [0]; s.reset; s }; GC.start; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
62992
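
The same idea in script form, as a minimal sketch: long_lived is an illustrative name, and whether the reset actually frees memory depends on the Ruby version and on how the set got small.

```ruby
require "set"

# A long-lived set that shrinks from 10,000 elements to 1.
long_lived = Set.new(10_000.times)
long_lived.subtract(1...10_000)

# Set#reset rehashes in place and returns the receiver,
# so the contents and object identity are unchanged.
same_object = long_lived.reset.equal?(long_lived)

p long_lived.to_a
p same_object
```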

Automatic shrinking (PR at https://github.com/ruby/ruby/pull/8748) should help the worst cases like the repro so that seems good anyway.

Updated by nobu (Nobuyoshi Nakada) about 1 year ago

  • Status changed from Open to Closed

Applied in changeset git|9eac9d71786a8dbec520d0541a91149f01adf8ea.


[Bug #19969] Compact st_table after deleted if possible

Updated by nagachika (Tomoyuki Chikanaga) about 1 year ago

  • Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE

ruby_3_2 1cc38d5a2f84733e1c2e42548639e2891fe61e69 merged revision(s) 9eac9d71786a8dbec520d0541a91149f01adf8ea.

Updated by hsbt (Hiroshi SHIBATA) about 1 year ago

Thanks nobu and nagachika.

I confirmed that this regression is resolved on the ruby_3_2 branch.

# Before
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-07-05 revision 2f603bc4d7) +YJIT [arm64-darwin23]
4564304

# After
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-11-19 revision d9f4f321c6) +YJIT [arm64-darwin23]
40864

Updated by usa (Usaku NAKAMURA) about 1 year ago

  • Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE to 3.0: DONTNEED, 3.1: DONE, 3.2: DONE

ruby_3_1 1cae5e7ceaca7304108fdec35d4858a9e4ff7fe0 merged revision(s) 9eac9d71786a8dbec520d0541a91149f01adf8ea.
