Feature #10096
closed[PATCH] use khash for fstring and id_str tables
Description
frozen_strings and global_symbols.id_str hashes are two of the bigger
hashes in Ruby. They are needlessly ordered and incurs malloc overhead
in every st_table_entry.
Use an unordered open-addressing table which incurs no additional malloc
overhead besides the (larger) table itself.
Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
amd64 Debian stable due to having fewer allocations
global_symbols.str_id is left unchanged (for now) because it is used for
Symbol.all_symbols where ordering is expected
This introduces no user-visible changes or incompatibility
(unless I added a bug :x).
I chose khash because it is flexible and has (IMHO) a good API.
The API should also be familiar to mruby hackers, as mruby uses
a version of khash.
Future changes:
- covert smaller internal hashes where ordering is not exposed to Ruby users
- (hopefully) other hashes (methods/constants/ivars) hashes [Feature #9614]
(Note: tried a few times in lynx with 503 errors, never had this problem before
in lynx, trying clunky browser now)
Files
Updated by normalperson (Eric Wong) over 10 years ago
normalperson@yhbt.net wrote:
Feature #10096: [PATCH] use khash for fstring and id_str tables
https://bugs.ruby-lang.org/issues/10096
Updated for latest symbol.c changes in r46972
http://80x24.org/khash_v2.patch
P.S.: hsbt (or anybody else): Redmine gave me a 503 error when
attempting to upload the above patch (just a few minutes ago).
I also had problems last week, but uploading patches seems OK
on other tickets using lynx.
Client IP is 64.71.152.64 (dcvr.yhbt.net / 80x24.org server)
Updated by ko1 (Koichi Sasada) over 10 years ago
Eric Wong wrote:
frozen_strings and global_symbols.id_str hashes are two of the bigger
hashes in Ruby. They are needlessly ordered and incurs malloc overhead
in every st_table_entry.
We will change global_symbols.id_str (simple array) soon.
So do not touch it.
ID is now includes serial number. So use it.
But there are several issues to apply this simple approach.
For example, some IDs share same serial number.
But we (Nobu and I) discussed to solve them, and made up solutions.
Use an unordered open-addressing table which incurs no additional malloc
overhead besides the (larger) table itself.
Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
amd64 Debian stable due to having fewer allocations
Please write "from which size".
100MB -> (100MB-200KB) is trivial. 300KB->100KB is great work.
I chose khash because it is flexible and has (IMHO) a good API.
The API should also be familiar to mruby hackers, as mruby uses
a version of khash.
Did you check performance?
I know mruby uses it (and maybe more products).
So I assume it is sophisticated. How do you feel?
Updated by normalperson (Eric Wong) over 10 years ago
ko1@atdot.net wrote:
Eric Wong wrote:
frozen_strings and global_symbols.id_str hashes are two of the bigger
hashes in Ruby. They are needlessly ordered and incurs malloc overhead
in every st_table_entry.We will change global_symbols.id_str (simple array) soon.
So do not touch it.ID is now includes serial number. So use it.
But there are several issues to apply this simple approach.
For example, some IDs share same serial number.
But we (Nobu and I) discussed to solve them, and made up solutions.
OK, I look forward to your changes.
Use an unordered open-addressing table which incurs no additional malloc
overhead besides the (larger) table itself.Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
amd64 Debian stable due to having fewer allocationsPlease write "from which size".
100MB -> (100MB-200KB) is trivial. 300KB->100KB is great work.
The results go from around ~9300KB -> ~9100KB or so. I think
the variation is due to hash randomization, but there may be
other factors.
It used to be in the ~7300KB range a few months ago but I think GC
changes increased usage of initial startup...
I chose khash because it is flexible and has (IMHO) a good API.
The API should also be familiar to mruby hackers, as mruby uses
a version of khash.Did you check performance?
I know mruby uses it (and maybe more products).
So I assume it is sophisticated. How do you feel?
Our benchmark suite doesn't reveal much:
http://80x24.org/bmlog-20140803-231204.7455
Adding an unrealistic micro-bench shows st is faster:
http://80x24.org/fstring_iter_st.patch
http://80x24.org/fstring_iter_khash.patch
$ FSTRING_ITER=1000000 /usr/bin/time ./ruby.st -e exit
1.30user 0.00system 0:01.30elapsed 99%CPU (0avgtext+0avgdata 9240maxresident)k
0inputs+0outputs (0major+1348minor)pagefaults 0swaps
$ FSTRING_ITER=1000000 /usr/bin/time ./ruby.khash -e exit
1.60user 0.00system 0:01.61elapsed 99%CPU (0avgtext+0avgdata 9108maxresident)k
0inputs+0outputs (0major+1306minor)pagefaults 0swaps
Memory usage is from fstring change only, id_str untouched.
I don't think the performance is noticeable in real world, and the
memory/pagefault reduction is not big, either.
Your call :)
Updated by hsbt (Hiroshi SHIBATA) about 10 years ago
- Status changed from Open to Assigned
Updated by ko1 (Koichi Sasada) about 10 years ago
Sorry for delay.
I read your patch, and I'm not sure it is good idea to make klib directory.
This is only one thing I'm afraid.
Updated by normalperson (Eric Wong) about 10 years ago
- File khash-fstring-v3.patch khash-fstring-v3.patch added
updated patch:
- khash.h moved to top-level, klib/ removed
- updated for recent fstring-related changes in string.c
Updated by normalperson (Eric Wong) about 10 years ago
normalperson@yhbt.net wrote:
File khash-fstring-v3.patch added
OK to commit before 9/14? I think khash is the best for fstring
as we only store one VALUE (no extra allocation)
- covert smaller internal hashes where ordering is not exposed to Ruby users
- (hopefully) other hashes (methods/constants/ivars) hashes [Feature #9614]
Maybe ihash is better for method entries and constant entries because
they require a struct allocation.
I'm refreshing that series for methods + constants with ordering as a
split-out patch. We may keep ordering (at memory cost) since ccan/list
makes it easy and we still save memory from avoiding st_table_entry
allocations.
I'm not going to change ivars for now.
Updated by naruse (Yui NARUSE) almost 7 years ago
- Target version deleted (
2.2.0)
Updated by normalperson (Eric Wong) almost 7 years ago
- Status changed from Assigned to Rejected