Feature #9638

closed

[PATCH] limit IDs to 32-bits on 64-bit systems

Added by normalperson (Eric Wong) over 9 years ago. Updated over 9 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
[ruby-core:61496]

Description

This should allow better use of cache-friendly lookup mechanisms such as
funny_falcon's sparse array in [ruby-core:55079].

Also limits symbol space to prevent OOM.

Some structs may also be made smaller as a result (rb_method_entry_t).

We're changing ABI for 2.2.0 anyways, so this is a good time to introduce
this change.


Files

0001-ID-is-always-uint32_t.patch (3.62 KB), normalperson (Eric Wong), 03/14/2014 07:06 PM

Related issues 1 (0 open, 1 closed)

Related to Ruby master - Feature #11420: Introduce ID key table into MRI (Closed, ko1 (Koichi Sasada))

Updated by normalperson (Eric Wong) over 9 years ago

sparse array is described in ruby-core:55079

Updated by normalperson (Eric Wong) over 9 years ago

I'm not sure if this is possible anymore due to Symbol GC.
No big deal, though.

Updated by ngoto (Naohisa Goto) over 9 years ago

I'm using machines that have 2TB or more main memory. I think the machines can treat more than 2**32 symbols and I want to use full 64-bit capacity.

Updated by normalperson (Eric Wong) over 9 years ago

I am OK with closing this issue (but I'm not sure if I have permissions
to close on redmine).

However, your applications need more than 2**32 different symbols?
That scares me :*(
How much memory do your Ruby processes use?

The Symbol table currently takes at least (48 + 48 + 40 = 136) bytes per
symbol on 64-bit, so 136 * (2 ** 32) bytes is 544 GiB just for the
symbol table (w/fstrings) in your app. That does not even account for
memory of symbols with string representations longer than 23 bytes,
nor the memory for hash table buckets.

I need to know because I am also looking into using khash[1] for the
symbol table. By default, khash internal buckets/counters are all
32-bits. We can tweak khash to use 64-bit counters if needed,
but 2**32 symbols really should be enough.

The symbol table with khash might reduce memory overhead to ~90 bytes
per-symbol on average, though...

[1] git clone https://github.com/attractivechaos/klib.git
mruby also uses khash for (all?) its hash table needs.

Updated by normalperson (Eric Wong) over 9 years ago

  • Status changed from Open to Rejected

Updated by ko1 (Koichi Sasada) over 9 years ago

(2014/03/15 4:07), wrote:

> Also limits symbol space to prevent OOM.

What is OOM?
Out of memory?

Symbol GC doesn't help?

--
// SASADA Koichi at atdot dot net

Updated by normalperson (Eric Wong) over 9 years ago

SASADA Koichi wrote:

> (2014/03/15 4:07), wrote:
>
> > Also limits symbol space to prevent OOM.
>
> What is OOM?
> Out of memory?

Yes, out-of-memory.

> Symbol GC doesn't help?

It does; but OOM was a secondary concern of mine.

I mainly wanted 32-bit ID so it might be easier to pack some structs
on 64-bit machines. 64-bit ID is not a big issue, though.

Updated by ngoto (Naohisa Goto) about 8 years ago
