Bug #12198
closedHash#== sometimes returns false incorrectly
Description
Hi!
Sorry for lack of the accuracy in the bug title. I have some trouble with pinpointing the issue.
According to documentation, "two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash." I was able to produce two hashes which satisfy this condition, however the method returns false. In other words, following happens:
e.class #=> Hash
r.class #=> Hash
e.size == r.size #=> true
e.each_pair.to_a == r.each_pair.to_a #=> true
e == r #=> false
That happens in Ruby 1.9.3, 2.3, 2.4 and probably in other versions as well. Pure Ruby, no gem could interfere.
Happy Easter ]:->
Files
Updated by skalee (Sebastian Skalacki) over 8 years ago
- Description updated (diff)
Updated by skalee (Sebastian Skalacki) over 8 years ago
- Description updated (diff)
Updated by mame (Yusuke Endoh) over 8 years ago
- Status changed from Open to Assigned
- Assignee set to knu (Akinori MUSHA)
Simplified.
require "set"
a = []
s1 = Set[a]
a << 42
s2 = Set[[42]]
p s2 #=> #<Set: {[42]}>
p s1 #=> #<Set: {[42]}>
p s2 == s1 #=> false
Modifying an element of a set causes this issue. I'm unsure if this is a bug.
--
Yusuke Endoh mame@ruby-lang.org
Updated by skalee (Sebastian Skalacki) over 8 years ago
Actually it has nothing to do with sets:
b = []
h1 = {b => true}
b << 42
h2 = {[42] => true}
p h2 #=> {[42]=>true}
p h1 #=> {[42]=>true}
p h2 == h1 #=> false
Updated by sawa (Tsuyoshi Sawada) over 8 years ago
Sebastian Skalacki wrote:
b = [] h1 = {b => true} b << 42 h2 = {[42] => true} p h2 #=> {[42]=>true} p h1 #=> {[42]=>true} p h2 == h1 #=> false
You have to apply Hash#rehash
.
h1.rehash
h2 == h1 # => true
And since Set
is based on Hash
, this carries over to Set
.
Yusuke Endoh wrote:
require "set" a = [] s1 = Set[a] a << 42 s2 = Set[[42]] p s2 #=> #<Set: {[42]}> p s1 #=> #<Set: {[42]}> p s2 == s1 #=> false
I think this is not a bug, but the remaining issue here, if any, is whether there should be a counterpart to rehash
for Set
. But I am not sure if Sebastian Skalacki would be asking for that.
Updated by skalee (Sebastian Skalacki) over 8 years ago
Yes, I believe that implementing Set#rehash
is a good idea (unless rehashing automatically when needed would be a better one). Furthermore, documentation is quite confusing on this topic. The caveat is indeed somewhat mentioned in the Hash#rehash
method description — but nowhere else.
Updated by shevegen (Robert A. Heiler) over 8 years ago
Here is Hash#rehash link:
http://ruby-doc.org/core-2.3.0/Hash.html#method-i-rehash
Perhaps the documentation could be updated regardless, to also notify the ruby user when .rehash may be useful. For instance, this is the first time that I read about two sets with the same content, may be considered not equal and that a .rehash can fix this behaviour displayed. I don't think I have actually seen .rehash used before yet, one can always learn something new in these bug reports. :)
Updated by skalee (Sebastian Skalacki) over 8 years ago
IMHO documentation on Hash#== is incorrect at the moment. It says:
Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.
Which is definitely inconsistent with what have been observed and described in this ticket. Furthermore, two key-value pairs [k1, v1]
and [k2, v2]
are equal when v1 == v2 && k1.eql?(k2) && k1.hash == k2.hash
:
a = Object.new #=> #<Object:0x007f8d60eaf570>
b = Object.new #=> #<Object:0x007f8d63713458>
def a.eql? _ ; true ; end #=> :eql?
def b.eql? _ ; true ; end #=> :eql?
a.eql?(b) #=> true
{a => true} == {b => true} #=> false
def a.hash ; 1 ; end #=> :hash
def b.hash ; 1 ; end #=> :hash
{a => true} == {b => true} #=> true
The k1.hash == k2.hash
condition is actually an implication of how #eql?
is intended to work, the description of Object#hash
states that clearly:
Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.
Unfortunately, the description of Object#eql?
says something different:
The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)
Therefore, it suggests that eql?
could be defined as:
def eql? other
self.hash == other.hash
end
Which is not true:
a = Object.new #=> #<Object:0x007fe18f819660>
b = Object.new #=> #<Object:0x007fe18e3dcc60>
def a.hash ; 44 ; end #=> :hash
def b.hash ; 44 ; end #=> :hash
a.eql? b #=> false
Finally, when it comes to Set#==
description:
Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?.
It is correct, although I believe it could be improved too. The need for #rehash
(when introduced) could be mentioned and the fact that #==
does not imply #eql?
could be emphasised. The latter is important because one could easily think that if some_array_1 == some_array_2
then some_array_1.to_set == some_array_2.to_set
, but this is not true.
To sum it all up:
Set#rehash
is required
I guess it's not controversial, or is it? Can I make a pull request?
Hash#==
description is seriously wrong
It says:
Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.
It could say:
Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (keys according to Object.eql?, values according to Object#==) the corresponding elements in the other hash.
Moreover, some notice about the need for Hash#rehash
is necessary. I'm not sure whether it should be inserted here or in the "Hash Keys" section of the class description.
Object#eql?
description is wrong
It says:
The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)
It could say:
The eql? is used by Hash to test members for equality. This method must have the property that a.eql?(b) implies a.hash == b.hash. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)
Set#==
description is very good, but it could be improved as well
It says:
Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?.
It could say:
Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?. Please note that equality of two elements according to Object#== does not imply their equality according to Object#eql?.
Please improve English in the changes I've proposed.
Updated by sawa (Tsuyoshi Sawada) over 8 years ago
Sebastian Skalacki wrote:
Set#rehash
is required
I thought you had previously written:
Actually it has nothing to do with sets
Furthermore, is this even a bug?
Updated by skalee (Sebastian Skalacki) over 8 years ago
Tsuyoshi Sawada wrote:
I thought you had previously written:
Actually it has nothing to do with sets
I've reported it as an issue with Hash#==
method initially. The lack of Set#rehash
has been pointed out by you and I suppose it should be implemented.
Furthermore, is this even a bug?
The documentation on Hash#==
clearly describes different behaviour, therefore I've reported it as a bug.
Updated by knu (Akinori MUSHA) almost 8 years ago
The documentation of Set clearly states the following:
- Set assumes that the identity of each element does not change
while it is stored. Modifying an element of a set will render the
set to an unreliable state.
So it is not a bug, but that you are not supposed to modify an object once stored in a set.
Updated by knu (Akinori MUSHA) about 7 years ago
- Status changed from Assigned to Closed