Project

General

Profile

Actions

Feature #10185

closed

[PATCH] iseq: free untranslated iseq->iseq at compile

Added by normalperson (Eric Wong) almost 10 years ago. Updated almost 10 years ago.

Status:
Closed
Target version:
[ruby-core:64663]

Description

running "ruby -rpp -e 'pp GC.stat'", a reduction in
malloc usage is shown:

before:
:malloc_increase=>118784,
:oldmalloc_increase=>1178736,

after:
:malloc_increase=>99832,
:oldmalloc_increase=>1031976,

(numbers from x86-64)

  • compile.c (rb_iseq_translate_threaded_code): free iseq->iseq
    (rb_vm_addr2insn): new function for debug
    (rb_iseq_untranslate_threaded_code): ditto

  • iseq.c (iseq_memsize): support iseq->iseq == 0
    (rb_iseq_disasm): untranslate for dump
    (iseq_data_to_ary): ditto
    (rb_iseq_line_trace_each): ditto

  • vm_dump.c (rb_vmdebug_debug_print_pre): ditto

I think we can remove iseq->iseq field from the rb_iseq_t structure in
the future and use transient allocation for the debug functions.
rb_iseq_untranslate_threaded_code is an uncommon code path for debug and
not a performance issue in normal code.


Files

iseq-tc-diet.patch (4.43 KB) iseq-tc-diet.patch normalperson (Eric Wong), 08/29/2014 09:24 PM
iseq-iseq-diet.patch (9.54 KB) iseq-iseq-diet.patch normalperson (Eric Wong), 08/29/2014 11:10 PM
0001-iseq-translate-iseq-inplace.patch (10.1 KB) 0001-iseq-translate-iseq-inplace.patch normalperson (Eric Wong), 09/09/2014 08:21 AM

Updated by normalperson (Eric Wong) almost 10 years ago

Patch #2:

Subject: [PATCH] iseq (rb_iseq_t): remove iseq->iseq field

It is unnecessary and reduces transient malloc/free during compile.

For "ruby -e exit", valgrind reports over 300K reduction in
overall allocations.

before:
total heap usage: 49,622 allocs, 20,492 frees, 8,697,493 bytes allocated

after:
total heap usage: 48,935 allocs, 19,805 frees, 8,373,773 bytes allocated

Updated by normalperson (Eric Wong) almost 10 years ago

https://bugs.ruby-lang.org/issues/10185#change-48562

---Files--------------------------------
iseq-tc-diet.patch (4.43 KB)
iseq-iseq-diet.patch (9.54 KB)

Ping? I hope to commit these two soon.

Updated by ko1 (Koichi Sasada) almost 10 years ago

Thank you for waiting my comment.

I'm thinking about compatibility and expandability.

Now, in Ruby source, iseq->iseq is used only for disasm. So that your patch works completely.
However we have several compilers (iseq -> C) to use iseq->iseq directly.

Of course, most of people don't use this compiler.
So your proopsal is reasonable.

On the other hand, the name `iseq_encoded' is not only for direct threaded code. We can make another encoding scheme.


Summary: my proposal is:

  • remain iseq->iseq (NULL as default)
  • rename rb_iseq_untranslate_threaded_code to rb_iseq_decode_encoded_iseq() and fill iseq->iseq if iseq->iseq is not NULL.
  • rb_iseq_decode_encoded_iseq() can call rb_iseq_untranslate_threaded_code() (one of decode scheme).
  • expose (not in ruby/ but internal.h and expose symbol) rb_iseq_decode_encoded_iseq() for maniac users.

Updated by ko1 (Koichi Sasada) almost 10 years ago

Koichi Sasada wrote:

  • rename rb_iseq_untranslate_threaded_code to rb_iseq_decode_encoded_iseq() and fill iseq->iseq if iseq->iseq is not NULL.

decode is also strange.

const VALUE *rb_iseq_orignal_iseq() ?

Updated by normalperson (Eric Wong) almost 10 years ago

This combines both patches to do translation in-place.

I couldn't make rb_iseq_original_iseq return const because of
rb_iseq_line_trace_each, but it now caches iseq->iseq field
(at the bottom of rb_iseq_t, away from hotter fields).

Also, I hope to make compile_data and iseq->iseq fields a union,
so it might be iseq->u1.compile_dat/iseq->u1.iseq in the future.

Actions #6

Updated by Anonymous almost 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r47508.


compile: translate iseq in-place

running "ruby -rpp -e 'pp GC.stat'", a reduction in
malloc usage is shown:

before:
:malloc_increase=>118784,
:oldmalloc_increase=>1178736,

after:
:malloc_increase=>99832,
:oldmalloc_increase=>1031976,

For "ruby -e exit", valgrind reports over 300K reduction in
overall allocations (and unnecessary memory copies).

before:
total heap usage: 49,622 allocs, 20,492 frees, 8,697,493 bytes allocated
after:
total heap usage: 48,935 allocs, 19,805 frees, 8,373,773 bytes allocated

(numbers from x86-64)

v2 changes based on ko1 recommendations [ruby-core:64883]:

  • squashed in-place direct thread translation to avoid alloc+copy
  • renamed rb_iseq_untranslate_threaded_code to rb_iseq_original_iseq,
    cache new iseq->iseq_original field.
  • compile.c (rb_iseq_translate_threaded_code): modify in-place w/o copy
    (rb_vm_addr2insn): new function for debug
    (rb_iseq_original_iseq): ditto
    (iseq_set_sequence): assign iseq_encoded directly
    [Feature #10185]

  • vm_core (rb_iseq_t): move original ->iseq to bottom

  • iseq.c (iseq_free, iseq_free): adjust for new layout
    (rb_iseq_disasm): use original iseq for dump
    (iseq_data_to_ary): ditto
    (rb_iseq_line_trace_each): ditto
    (rb_iseq_build_for_ruby2cext): use iseq_encoded directly

  • vm_dump.c (rb_vmdebug_debug_print_pre): use original iseq

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0