Project

General

Profile

Actions

Bug #12198

closed

Hash#== sometimes returns false incorrectly

Added by skalee (Sebastian Skalacki) over 8 years ago. Updated about 7 years ago.

Status:
Closed
Target version:
-
ruby -v:
ruby 2.4.0dev (2016-03-11 trunk 54086) [x86_64-darwin14]
[ruby-core:74460]

Description

Hi!

Sorry for lack of the accuracy in the bug title. I have some trouble with pinpointing the issue.

According to documentation, "two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash." I was able to produce two hashes which satisfy this condition, however the method returns false. In other words, following happens:

e.class #=> Hash
r.class #=> Hash
e.size == r.size #=> true
e.each_pair.to_a == r.each_pair.to_a #=> true
e == r #=> false

That happens in Ruby 1.9.3, 2.3, 2.4 and probably in other versions as well. Pure Ruby, no gem could interfere.

Happy Easter ]:->


Files

problem.rb (1.69 KB) problem.rb Piece of code which instantiates that problematic Hash and illustrates the issue skalee (Sebastian Skalacki), 03/19/2016 01:31 PM

Updated by skalee (Sebastian Skalacki) over 8 years ago

  • Description updated (diff)

Updated by skalee (Sebastian Skalacki) over 8 years ago

  • Description updated (diff)

Updated by mame (Yusuke Endoh) over 8 years ago

  • Status changed from Open to Assigned
  • Assignee set to knu (Akinori MUSHA)

Simplified.

require "set"

a = []
s1 = Set[a]
a << 42

s2 = Set[[42]]

p s2 #=> #<Set: {[42]}>
p s1 #=> #<Set: {[42]}>
p s2 == s1 #=> false

Modifying an element of a set causes this issue. I'm unsure if this is a bug.

--
Yusuke Endoh

Updated by skalee (Sebastian Skalacki) over 8 years ago

Actually it has nothing to do with sets:

b = []
h1 = {b => true}
b << 42

h2 = {[42] => true}

p h2 #=> {[42]=>true}
p h1 #=> {[42]=>true}
p h2 == h1 #=> false

Updated by sawa (Tsuyoshi Sawada) over 8 years ago

Sebastian Skalacki wrote:

b = []
h1 = {b => true}
b << 42

h2 = {[42] => true}

p h2 #=> {[42]=>true}
p h1 #=> {[42]=>true}
p h2 == h1 #=> false

You have to apply Hash#rehash.

h1.rehash
h2 == h1 # => true

And since Set is based on Hash, this carries over to Set.

Yusuke Endoh wrote:

require "set"

a = []
s1 = Set[a]
a << 42

s2 = Set[[42]]

p s2 #=> #<Set: {[42]}>
p s1 #=> #<Set: {[42]}>
p s2 == s1 #=> false

I think this is not a bug, but the remaining issue here, if any, is whether there should be a counterpart to rehash for Set. But I am not sure if Sebastian Skalacki would be asking for that.

Updated by skalee (Sebastian Skalacki) over 8 years ago

Yes, I believe that implementing Set#rehash is a good idea (unless rehashing automatically when needed would be a better one). Furthermore, documentation is quite confusing on this topic. The caveat is indeed somewhat mentioned in the Hash#rehash method description — but nowhere else.

Updated by shevegen (Robert A. Heiler) over 8 years ago

Here is Hash#rehash link:

http://ruby-doc.org/core-2.3.0/Hash.html#method-i-rehash

Perhaps the documentation could be updated regardless, to also notify the ruby user when .rehash may be useful. For instance, this is the first time that I read about two sets with the same content, may be considered not equal and that a .rehash can fix this behaviour displayed. I don't think I have actually seen .rehash used before yet, one can always learn something new in these bug reports. :)

Updated by skalee (Sebastian Skalacki) over 8 years ago

IMHO documentation on Hash#== is incorrect at the moment. It says:

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.

Which is definitely inconsistent with what have been observed and described in this ticket. Furthermore, two key-value pairs [k1, v1] and [k2, v2] are equal when v1 == v2 && k1.eql?(k2) && k1.hash == k2.hash:

a = Object.new   #=> #<Object:0x007f8d60eaf570>
b = Object.new   #=> #<Object:0x007f8d63713458>
def a.eql? _ ; true ; end   #=> :eql?
def b.eql? _ ; true ; end   #=> :eql?
a.eql?(b)   #=> true
{a => true} == {b => true}   #=> false
def a.hash ; 1 ; end   #=> :hash
def b.hash ; 1 ; end   #=> :hash
{a => true} == {b => true}   #=> true

The k1.hash == k2.hash condition is actually an implication of how #eql? is intended to work, the description of Object#hash states that clearly:

Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.

Unfortunately, the description of Object#eql? says something different:

The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)

Therefore, it suggests that eql? could be defined as:

def eql? other
  self.hash == other.hash
end

Which is not true:

a = Object.new   #=> #<Object:0x007fe18f819660>
b = Object.new   #=> #<Object:0x007fe18e3dcc60>
def a.hash ; 44 ; end   #=> :hash
def b.hash ; 44 ; end   #=> :hash
a.eql? b   #=> false

Finally, when it comes to Set#== description:

Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?.

It is correct, although I believe it could be improved too. The need for #rehash (when introduced) could be mentioned and the fact that #== does not imply #eql? could be emphasised. The latter is important because one could easily think that if some_array_1 == some_array_2 then some_array_1.to_set == some_array_2.to_set, but this is not true.

To sum it all up:

Set#rehash is required

I guess it's not controversial, or is it? Can I make a pull request?

Hash#== description is seriously wrong

It says:

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.

It could say:

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (keys according to Object.eql?, values according to Object#==) the corresponding elements in the other hash.

Moreover, some notice about the need for Hash#rehash is necessary. I'm not sure whether it should be inserted here or in the "Hash Keys" section of the class description.

Object#eql? description is wrong

It says:

The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)

It could say:

The eql? is used by Hash to test members for equality. This method must have the property that a.eql?(b) implies a.hash == b.hash. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)

Set#== description is very good, but it could be improved as well

It says:

Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?.

It could say:

Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?. Please note that equality of two elements according to Object#== does not imply their equality according to Object#eql?.

Please improve English in the changes I've proposed.

Updated by sawa (Tsuyoshi Sawada) over 8 years ago

Sebastian Skalacki wrote:

Set#rehash is required

I thought you had previously written:

Actually it has nothing to do with sets

Furthermore, is this even a bug?

Updated by skalee (Sebastian Skalacki) over 8 years ago

Tsuyoshi Sawada wrote:

I thought you had previously written:

Actually it has nothing to do with sets

I've reported it as an issue with Hash#== method initially. The lack of Set#rehash has been pointed out by you and I suppose it should be implemented.

Furthermore, is this even a bug?

The documentation on Hash#== clearly describes different behaviour, therefore I've reported it as a bug.

Updated by knu (Akinori MUSHA) almost 8 years ago

The documentation of Set clearly states the following:

  • Set assumes that the identity of each element does not change
    while it is stored. Modifying an element of a set will render the
    set to an unreliable state.

So it is not a bug, but that you are not supposed to modify an object once stored in a set.

Actions #12

Updated by knu (Akinori MUSHA) about 7 years ago

  • Status changed from Assigned to Closed
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0