Bug #17951
Closed
Collisions in Proc#hash values for blocks defined at the same line
Description
require 'set'
def capture(&block)
  block
end
# Create 1,000 blocks, all defined on the same source line
blocks = Array.new(1000) { capture { :foo } }
hashes = blocks.map(&:hash).uniq
ids = blocks.map(&:object_id).uniq
equality = blocks.map { blocks[0].eql?(_1) }.tally
hash = blocks.to_h { [_1, nil] }
set = blocks.to_set
puts(hashes.size) # => 11
puts(ids.size) # => 1000
puts(equality.inspect) # => {true=>1, false=>999}
puts(hash.size) # => 1000
puts(set.size) # => 1000
The script builds one thousand blocks and then compares them in various ways. I would expect proc objects to be completely opaque and thus treated as separate, unequal objects. All tests but the first confirm this expectation. However, Proc#hash doesn't return 1000 different results; the number of distinct values varies between 3 and 20 on my machine.
As I understand it, the current behavior doesn't violate Ruby's guarantees. But I would expect Proc#hash results to be as unique as Proc#object_id, or at least a lot more unique than they currently are.
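To illustrate the point about guarantees: the contract that matters for Hash and Set is that objects which are eql? must have equal hash values, not the reverse. The sketch below uses a hypothetical class, ConstantHashKey (not part of the report), whose instances all return the same hash. Lookups stay correct because eql? still tells the keys apart; the cost is performance, since every key lands in the same bucket.

```ruby
# Hypothetical class for illustration: every instance reports the same hash.
# eql? is inherited from Object, so it compares by identity.
class ConstantHashKey
  def hash
    42
  end
end

keys = Array.new(100) { ConstantHashKey.new }

# All keys collide on their hash value...
puts keys.map(&:hash).uniq.size

# ...yet the Hash still keeps them as distinct entries, because eql? differs.
table = keys.to_h { |key| [key, nil] }
puts table.size
```

This mirrors the results in the script above: hash.size and set.size stay at 1000 even though Proc#hash produces only a handful of distinct values.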
The problem is likely to occur only for blocks defined at the same line.
Similar/related issue: https://bugs.ruby-lang.org/issues/6048
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
A possible fix:
--- a/proc.c
+++ b/proc.c
@@ -1451,7 +1451,7 @@ rb_hash_proc(st_index_t hash, VALUE prc)
     GetProcPtr(prc, proc);
     hash = rb_hash_uint(hash, (st_index_t)proc->block.as.captured.code.val);
     hash = rb_hash_uint(hash, (st_index_t)proc->block.as.captured.self);
-    return rb_hash_uint(hash, (st_index_t)proc->block.as.captured.ep >> 16);
+    return rb_hash_uint(hash, (st_index_t)proc->block.as.captured.ep);
 }
 MJIT_FUNC_EXPORTED VALUE
I do not understand the meaning of the 16-bit right shift in the original code.
Updated by decuplet (Nikita Shilnikov) over 3 years ago
It has been there since 1.9, as far as I can see: https://github.com/ruby/ruby/commit/a3e1b1ce7ed7e7ffac23015fc2fde56511b30681#diff-2672918174f926386106967d117f11da8aa1905772dcf48fce53694386e4a666R658-R668
Updated by jeremyevans0 (Jeremy Evans) over 3 years ago
I have submitted a pull request with @xtkoba's fix: https://github.com/ruby/ruby/pull/4574
Updated by decuplet (Nikita Shilnikov) over 3 years ago
- Subject changed from Collistions in Proc#hash values for blocks defined at the same line to Collisions in Proc#hash values for blocks defined at the same line
Updated by jeremyevans (Jeremy Evans) over 3 years ago
- Status changed from Open to Closed
Applied in changeset git|be230615d016e27d5b45b465d1481f6ecf7f1d28.
Remove shift of ep when computing Proc#hash
The shift was causing far fewer unique values of hash than expected.
Fix pointed out by xtkoba (Tee KOBAYASHI)
Fixes [Bug #17951]
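A rough sketch of why the shift collapsed the hash values, assuming the ep pointers of blocks created in a loop sit close together in memory. Shifting right by 16 bits discards the low 16 bits, so all pointers inside the same 64 KiB window map to one value. The base address, stride, and helper names below are hypothetical and chosen only for illustration; they are not taken from the Ruby VM.

```ruby
# Assumed values for illustration only (not real VM addresses).
BASE   = 0x7f00_0000_0000 # hypothetical 64 KiB-aligned starting address
STRIDE = 1024             # hypothetical distance in bytes between pointers
COUNT  = 1000

# Build COUNT evenly spaced fake pointer values.
def sample_addresses(count = COUNT, stride = STRIDE)
  Array.new(count) { |i| BASE + i * stride }
end

# Count how many distinct values remain after dropping the low `bits` bits.
def distinct_after_shift(addresses, bits)
  addresses.map { |addr| addr >> bits }.uniq.size
end

addrs = sample_addresses
puts "distinct values with >> 16:    #{distinct_after_shift(addrs, 16)}"
puts "distinct values without shift: #{distinct_after_shift(addrs, 0)}"
```

With these assumed numbers the 1000 pointers span just under 1 MiB, so the shifted variant yields 16 distinct values while the unshifted one yields 1000, which is in the same range as the 3 to 20 reported above.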