Bug #13660

rb_str_hash_m discards bits from the hash

Added by Eregon (Benoit Daloze) about 3 years ago. Updated 12 months ago.

Target version:
ruby -v:
ruby 2.3.3p222 (2016-11-21 revision 56859) [x64-mingw32]


I believe rb_str_hash_m might discard some bits from the hash value in some situations.

It computes the hash as a st_index_t, which is either a unsigned long or a unsigned long long.
But the st_index_t value is converted to a VALUE with:
#define ST2FIX(h) LONG2FIX((long)(h))

Note that for instance on x64-mingw32, SIZEOF_LONG is 4, but SIZEOF_LONG_LONG and SIZEOF_VOIDP are 8 bytes.
So that truncates half the bits of the hash on such a platform if my understanding is correct.

Even is SIZEOF_LONG is 8, LONG2FIX loses the MSB I think, given that not all long can fit the Fixnum range on MRI (should it be LONG2NUM?).
Also, I am not sure if it is intended to cast from a unsigned value to a signed value.

I tried many things while debugging the rb_str_hash spec on ruby/spec and eventually gave up.
This computation looks wrong to me in MRI.

For info, here is my debug code:
and the build result on AppVeyor:

Also available in: Atom PDF