Bug #17951

Collisions in Proc#hash values for blocks defined at the same line

Added by decuplet (Nikita Shilnikov) almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-darwin20]
[ruby-core:104248]

Description

require 'set'

def capture(&block)
  block
end

# Create 1,000 blocks from the same block literal
blocks = Array.new(1000) { capture { :foo } }

hashes = blocks.map(&:hash).uniq
ids = blocks.map(&:object_id).uniq
equality = blocks.map { blocks[0].eql?(_1) }.tally
hash = blocks.to_h { [_1, nil] }
set = blocks.to_set

puts(hashes.size)      # => 11
puts(ids.size)         # => 1000
puts(equality.inspect) # => {true=>1, false=>999}
puts(hash.size)        # => 1000
puts(set.size)         # => 1000

The script builds one thousand blocks and then compares them in various ways. I would expect proc objects to be completely opaque and therefore treated as separate, unequal objects. Every check except the first confirms this expectation. However, Proc#hash does not return 1000 distinct values; the number of distinct values varies between 3 and 20 on my machine.

As I understand it, the current behavior doesn't violate Ruby's guarantees. Still, I would expect Proc#hash results to be as unique as Proc#object_id, or at least far more unique than they currently are.
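
To illustrate why colliding hash values do not break correctness, here is a minimal sketch using a made-up class (`ConstantHashKey` is hypothetical and not part of Ruby) whose instances all return the same `hash`. Hash lookups still work because `eql?` decides equality; the cost of collisions is extra comparisons, not wrong answers.

```ruby
# Hypothetical key class: every instance reports the same hash value,
# so all keys collide, but eql? still tells them apart.
class ConstantHashKey
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def hash
    42
  end

  def eql?(other)
    other.is_a?(ConstantHashKey) && id == other.id
  end
end

keys  = Array.new(100) { |i| ConstantHashKey.new(i) }
table = keys.to_h { |key| [key, key.id] }

puts table.size                       # => 100
puts table[ConstantHashKey.new(57)]   # => 57
```

With many colliding keys, each lookup may have to compare against many candidates, which is why a poorly distributed Proc#hash matters for performance even though results stay correct.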

The problem is likely to occur only for blocks defined on the same line.
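
A rough way to check this on a given Ruby build is to compare one block literal evaluated many times with separate block literals on separate lines. The helper `distinct_hashes` is introduced here for illustration only, and no output is shown because the counts depend on the Ruby version and on memory layout.

```ruby
def capture(&block)
  block
end

# Number of distinct Proc#hash values among the given procs.
def distinct_hashes(procs)
  procs.map(&:hash).uniq.size
end

# One block literal evaluated 1000 times: same code and self each time.
same_literal = Array.new(1000) { capture { :foo } }

# Three separate block literals, each on its own line.
separate_literals = [
  capture { :foo },
  capture { :foo },
  capture { :foo },
]

puts distinct_hashes(same_literal)
puts distinct_hashes(separate_literals)
```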

Related issue: https://bugs.ruby-lang.org/issues/6048

Updated by xtkoba (Tee KOBAYASHI) almost 3 years ago

A possible fix:

--- a/proc.c
+++ b/proc.c
@@ -1451,7 +1451,7 @@ rb_hash_proc(st_index_t hash, VALUE prc)
     GetProcPtr(prc, proc);
     hash = rb_hash_uint(hash, (st_index_t)proc->block.as.captured.code.val);
     hash = rb_hash_uint(hash, (st_index_t)proc->block.as.captured.self);
-    return rb_hash_uint(hash, (st_index_t)proc->block.as.captured.ep >> 16);
+    return rb_hash_uint(hash, (st_index_t)proc->block.as.captured.ep);
 }
 
 MJIT_FUNC_EXPORTED VALUE

I do not understand the meaning of the 16-bit right shift in the original code.
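
As a rough sketch of what the shift does to the input of the hash: dropping the low 16 bits maps every address inside the same 64 KiB-aligned window to a single value. The addresses below are made up (a hypothetical base address and a 64-byte spacing), not real `ep` values, and the helper names are illustrative only.

```ruby
# Count how many distinct values remain after dropping the low `bits` bits.
def distinct_after_shift(addresses, bits)
  addresses.map { |addr| addr >> bits }.uniq.size
end

# Hypothetical pointer values: `count` addresses spaced `stride` bytes apart.
# The base and stride are assumptions chosen for illustration.
def sample_addresses(count: 1000, base: 0x7f80_1234_0000, stride: 64)
  Array.new(count) { |i| base + i * stride }
end

addresses = sample_addresses

puts distinct_after_shift(addresses, 16) # => 1
puts distinct_after_shift(addresses, 0)  # => 1000
```

If the environment pointers of the 1000 procs are allocated close together, only a handful of distinct shifted values would reach the hash function, which would be consistent with the 3 to 20 distinct hashes reported above.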

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago

I have submitted a pull request with @xtkoba's fix: https://github.com/ruby/ruby/pull/4574

Updated by decuplet (Nikita Shilnikov) almost 3 years ago

  • Subject changed from Collistions in Proc#hash values for blocks defined at the same line to Collisions in Proc#hash values for blocks defined at the same line

Updated by jeremyevans (Jeremy Evans) almost 3 years ago

  • Status changed from Open to Closed

Applied in changeset git|be230615d016e27d5b45b465d1481f6ecf7f1d28.


Remove shift of ep when computing Proc#hash

The shift was causing far fewer unique values of hash than expected.

Fix pointed out by xtkoba (Tee KOBAYASHI)

Fixes [Bug #17951]
