Project

General

Profile

Actions

Feature #10238

open

todo: remove dependency on malloc_usable_size

Added by normalperson (Eric Wong) about 10 years ago. Updated 8 months ago.

Status:
Assigned
Target version:
-
[ruby-core:65024]

Description

malloc_usable_size shows up at or near the top of many profiles for me.

We should be able to use ruby_sized_xfree in more places; especially
if rb_data_type_t->dsize is defined.

One possible improvement is to allow the rb_data_type_t->dsize pointer
to be a FIXNUM, removing the need for some memsize functions.

Furthermore, over-accounting malloc-ed bytes (presumably the reason
malloc_usable_size was introduced). should be less harmful nowadays with
incremental marking.


Files

Updated by normalperson (Eric Wong) about 10 years ago

wrote:

One possible improvement is to allow the rb_data_type_t->dsize pointer
to be a FIXNUM, removing the need for some memsize functions.

Something like:
http://80x24.org/spew/m/20140928022441.GA24830%40dcvr.yhbt.net.txt

16 files changed, 49 insertions(+), 146 deletions(-) \o/

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

I'm not sure if function pointers are guaranteed to be word-aligned.

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

If it's a Fixnum, the size should be 0 for NULL ptr.

enc_memsize seems unnecessary anyway, indeed.

Updated by normalperson (Eric Wong) about 10 years ago

wrote:

If it's a Fixnum, the size should be 0 for NULL ptr.

I was under the impression DATA_PTR is not NULL in nearly all cases,
but I did not check closely for all those classes.

But this patch may not be possible anyways since some platforms may have
non-word-aligned functions :<

enc_memsize seems unnecessary anyway, indeed.

I'm a little curious about that, actually; but haven't gotten around
to investigating.

Updated by ko1 (Koichi Sasada) about 10 years ago

Eric Wong wrote:

malloc_usable_size shows up at or near the top of many profiles for me.

We can check the performance impact by enabling HAVE_MALLOC_USABLE_SIZE macro.

Try this program:

require 'benchmark'
Benchmark.bm{|x|
  10.times{
    x.report{
      10_000_000.times{ '*' * 260 }
    }
  }
}

Results:

enable HAVE_MALLOC_USABLE_SIZE:
       user     system      total        real
   3.140000   0.010000   3.150000 (  3.144509)
   3.130000   0.000000   3.130000 (  3.132572)
   3.130000   0.000000   3.130000 (  3.138391)
   3.170000   0.000000   3.170000 (  3.169465)
   3.150000   0.000000   3.150000 (  3.160397)
   3.140000   0.000000   3.140000 (  3.134969)
   3.150000   0.000000   3.150000 (  3.150813)
   3.130000   0.000000   3.130000 (  3.134149)
   3.190000   0.000000   3.190000 (  3.196077)
   3.130000   0.000000   3.130000 (  3.132020)

disable HAVE_MALLOC_USABLE_SIZE:
       user     system      total        real
   2.940000   0.010000   2.950000 (  2.965000)
   2.950000   0.000000   2.950000 (  2.953580)
   2.940000   0.000000   2.940000 (  2.952080)
   2.970000   0.000000   2.970000 (  2.964060)
   2.970000   0.000000   2.970000 (  2.970614)
   3.020000   0.000000   3.020000 (  3.023180)
   3.000000   0.000000   3.000000 (  3.006568)
   3.000000   0.000000   3.000000 (  2.993302)
   2.940000   0.000000   2.940000 (  2.951028)
   2.980000   0.000000   2.980000 (  2.987452)

(Ubuntu 14.1 on VirtualBox on Windows7)

It seems about 3.1 sec ->2.9 sec (7% speedup) by disabling this feature for an intentional case.

We should be able to use ruby_sized_xfree in more places; especially
if rb_data_type_t->dsize is defined.

One possible improvement is to allow the rb_data_type_t->dsize pointer
to be a FIXNUM, removing the need for some memsize functions.

T_DATA is not so many, so that I don't think the overhead of calling function is matter.

BTW, as nobu said, the function pointers can be located odd address. I encounterd such case on mswin32 build.

Furthermore, over-accounting malloc-ed bytes (presumably the reason
malloc_usable_size was introduced). should be less harmful nowadays with
incremental marking.

At first, the purpose of using malloc_usable_size() is to measure malloc'ed block correctly.
As you know, malloc() can return bigger memory block than we specified.
We need to compare the advantage (preciseness) and the disadvantage (performance down).

And I also agree that performance is more important :)

Current malloc_increase (and oldmalloc_increase) is not considered carefully.
(Current implementation is ad-hoc)
We need to re-consider about it (include avoiding atomic operations).

Updated by shyouhei (Shyouhei Urabe) about 5 years ago

Bump. I would like to revisit this. Do we need this many call to malloc_usable_size()?

Updated by ko1 (Koichi Sasada) about 5 years ago

It should be fixed.

Updated by ko1 (Koichi Sasada) almost 4 years ago

  • Assignee set to ko1 (Koichi Sasada)
  • Target version deleted (3.0)
Actions #10

Updated by hsbt (Hiroshi SHIBATA) 8 months ago

  • Status changed from Open to Assigned
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0