Project

General

Profile

Actions

Feature #16375

closed

Right size regular expression compile buffers for literal regexes and on Regexp#freeze

Added by methodmissing (Lourens Naudé) over 4 years ago. Updated over 3 years ago.

Status:
Closed
Target version:
[ruby-core:95968]

Description

References PR https://github.com/ruby/ruby/pull/2696

As a continuation of type specific resize on freeze implementations of String and Array and looking into the Regexp type I found these memory access patterns for regular expression literals:

==22079== -------------------- 12 of 500 --------------------
==22079== max-live:    1,946,560 in 4,345 blocks
==22079== tot-alloc:   1,946,560 in 4,345 blocks (avg size 448.00)
==22079== deaths:      none (none of these blocks were freed)
==22079== acc-ratios:  1.36 rd, 0.98 wr  (2,651,994 b-read, 1,908,158 b-written)
==22079==    at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==22079==    by 0x24C496: onig_new_with_source (re.c:844)
==22079==    by 0x24C496: make_regexp (re.c:874)
==22079==    by 0x24C496: rb_reg_initialize (re.c:2858)
==22079==    by 0x24C496: rb_reg_initialize_str (re.c:2892)
==22079==    by 0x24C496: rb_reg_compile (re.c:2982)
==22079==    by 0x12EB84: rb_parser_reg_compile (parse.y:12185)
==22079==    by 0x12EB84: parser_reg_compile (parse.y:12179)
==22079==    by 0x12EB84: reg_compile (parse.y:12195)
==22079==    by 0x2147E3: new_regexp (parse.y:10101)
==22079==    by 0x2147E3: ruby_yyparse (parse.y:4419)
==22079==    by 0x2161F7: yycompile0 (parse.y:5942)
==22079==    by 0x3241FF: rb_suppress_tracing (vm_trace.c:427)
==22079==    by 0x1FDBF6: yycompile (parse.y:5991)
==22079==    by 0x1FDBF6: rb_parser_compile_file_path (parse.y:6130)
==22079==    by 0x27AC96: load_file_internal (ruby.c:2034)
==22079==    by 0x137730: rb_ensure (eval.c:1129)
==22079==    by 0x27CEEA: load_file (ruby.c:2153)
==22079==    by 0x27CEEA: rb_parser_load_file (ruby.c:2175)
==22079==    by 0x1954CE: load_iseq_eval (load.c:587)
==22079==    by 0x1954CE: rb_load_internal (load.c:651)
==22079==    by 0x1954CE: rb_f_load (load.c:709)
==22079==    by 0x2FB957: vm_call_cfunc_with_frame (vm_insnhelper.c:2468)
==22079==    by 0x2FB957: vm_call_cfunc (vm_insnhelper.c:2493)

Digging a little further and remembering some context of previous oniguruma memory investigation I remembered the pattern buffer struct has a compile buffer with a simple watermark for tracking used space. This changeset implements reg_resize (static as ary_resize) which attempts to right size the compile buffer if over allocated at the following sites:

  • After compiling a literal regular expression.
  • Implement an explicit type specific rb_reg_freeze and point Regexp#compile to it
  • I also follow the chain member which points to another regex_t on the struct if present, but have not been able to find references to it in the source tree other than for freeing a regex or inspecting it's memory footprint.

I introduced 2 new debug counters, which yields the following results on booting Redmine on Rails 5:

[RUBY_DEBUG_COUNTER]    obj_regexp_lit_extracapa                    6319
[RUBY_DEBUG_COUNTER]    obj_regexp_lit_extracapa_bytes            301685

About 300kb reallocated across 6319 oversized instances.

An example of Regexp#freeze

irb(main):007:0> r = Regexp.compile("(?!%\h\h|[!$-&(-;=?-_a-~]).")
irb(main):008:0> ObjectSpace.memsize_of(r)
=> 588
irb(main):009:0> r.freeze
=> /(?!%hh|[!$-&(-;=?-_a-~])./
irb(main):010:0> ObjectSpace.memsize_of(r)
=> 543

There is likely more layers that can be peeled back here, but keeping it simple and concise for review.

I think it's possible to get towards a state where 5 to 10MB RSS can be shaved off a standard Rails process by:

  • Being careful with buffer defaults for objects that have auxiliary buffers
  • Identifying hooks where excess allocation can be reduced to current watermark if the object buffer does not need to grow anymore (literal object, frozen object etc.)
  • A GC hook on promotion to OLD object to trim excessive capacity, which I'd consider as a special kind of garbage, as outlined in https://bugs.ruby-lang.org/issues/15402

Would any further incremental work towards this be considered worthwhile from a ruby-core and community perspective?

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0