Bug #12198: Hash#== sometimes returns false incorrectly - Ruby - Ruby Issue Tracking System

Bug #12198

closed

Hash#== sometimes returns false incorrectly

Added by skalee (Sebastian Skalacki) over 9 years ago. Updated over 7 years ago.

Status:

Closed

Assignee:

knu (Akinori MUSHA)

Target version:

ruby -v:

ruby 2.4.0dev (2016-03-11 trunk 54086) [x86_64-darwin14]

Backport:

2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN

[ruby-core:74460]

Description

Hi!

Sorry for the lack of accuracy in the bug title. I am having trouble pinpointing the issue.

According to the documentation, "two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash." I was able to produce two hashes which satisfy this condition, yet the method returns false. In other words, the following happens:

e.class #=> Hash
r.class #=> Hash
e.size == r.size #=> true
e.each_pair.to_a == r.each_pair.to_a #=> true
e == r #=> false

This happens in Ruby 1.9.3, 2.3, 2.4, and probably in other versions as well. It is pure Ruby, so no gem could interfere.

Happy Easter ]:->

Files

problem.rb (1.69 KB) problem.rb

Piece of code which instantiates that problematic Hash and illustrates the issue

skalee (Sebastian Skalacki), 03/19/2016 01:31 PM

#1 [ruby-core:74461]

Updated by skalee (Sebastian Skalacki) over 9 years ago

Description updated (diff)

#2 [ruby-core:74462]

Updated by skalee (Sebastian Skalacki) over 9 years ago

Description updated (diff)

#3 [ruby-core:74464]

Updated by mame (Yusuke Endoh) over 9 years ago

Status changed from Open to Assigned
Assignee set to knu (Akinori MUSHA)

Simplified.

require "set"

a = []
s1 = Set[a]
a << 42

s2 = Set[[42]]

p s2 #=> #<Set: {[42]}>
p s1 #=> #<Set: {[42]}>
p s2 == s1 #=> false

Modifying an element of a set causes this issue. I'm unsure if this is a bug.
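A minimal sketch of why the mutation matters, assuming (as the later comments about rehash suggest) that Hash and Set file each element under the hash value it had when it was inserted. The variable names are illustrative only.

```ruby
a = []
hash_before = a.hash   # hash value of the empty array
a << 42
hash_after = a.hash    # hash value after the mutation

# Array#hash is derived from the contents, so mutating the array
# changes the value a Hash or Set would use to locate it.
puts hash_before == hash_after   # expected: false
puts hash_after == [42].hash     # expected: true
```

A container that stored `a` while it was empty still has it filed under `hash_before`, which is why a lookup by the current contents can miss it.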

--
Yusuke Endoh mame@ruby-lang.org

#4 [ruby-core:74465]

Updated by skalee (Sebastian Skalacki) over 9 years ago

Actually it has nothing to do with sets:

b = []
h1 = {b => true}
b << 42

h2 = {[42] => true}

p h2 #=> {[42]=>true}
p h1 #=> {[42]=>true}
p h2 == h1 #=> false

#5 [ruby-core:74466]

Updated by sawa (Tsuyoshi Sawada) over 9 years ago

Sebastian Skalacki wrote:

b = []
h1 = {b => true}
b << 42

h2 = {[42] => true}

p h2 #=> {[42]=>true}
p h1 #=> {[42]=>true}
p h2 == h1 #=> false

You have to apply Hash#rehash.

h1.rehash
h2 == h1 # => true

And since Set is based on Hash, this carries over to Set.
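The effect of rehash can also be seen through key lookups. The sketch below additionally shows one possible workaround for Set, which is rebuilding the set from its elements; this is an assumption about what works in practice, not an officially documented counterpart to Hash#rehash.

```ruby
require "set"

b = []
h1 = { b => true }
b << 42

# The entry is still filed under the hash value of the empty array,
# so a lookup by the current contents misses it.
stale_lookup = h1[[42]]   # => nil
h1.rehash                 # re-index the entries by their current hash values
fresh_lookup = h1[[42]]   # => true

# Possible workaround for Set: build a new set from the elements,
# which hashes each element again at insertion time.
a = []
s1 = Set[a]
a << 42
rebuilt = Set.new(s1.to_a)
sets_equal = (rebuilt == Set[[42]])   # => true
```

Rebuilding allocates a new set, so any other references to the old set still point at the stale one.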

Yusuke Endoh wrote:

require "set"

a = []
s1 = Set[a]
a << 42

s2 = Set[[42]]

p s2 #=> #<Set: {[42]}>
p s1 #=> #<Set: {[42]}>
p s2 == s1 #=> false

I think this is not a bug, but the remaining issue here, if any, is whether there should be a counterpart to rehash for Set. But I am not sure if Sebastian Skalacki would be asking for that.

#6 [ruby-core:74468]

Updated by skalee (Sebastian Skalacki) over 9 years ago

Yes, I believe that implementing Set#rehash is a good idea (unless rehashing automatically when needed would be a better one). Furthermore, the documentation is quite confusing on this topic. The caveat is mentioned in the Hash#rehash method description, but nowhere else.

#7 [ruby-core:74476]

Updated by shevegen (Robert A. Heiler) over 9 years ago

Here is the Hash#rehash link:

http://ruby-doc.org/core-2.3.0/Hash.html#method-i-rehash

Perhaps the documentation could be updated regardless, to tell Ruby users when .rehash may be useful. For instance, this is the first time I have read that two sets with the same content may be considered unequal, and that a .rehash can fix this behaviour. I don't think I have seen .rehash used before; one can always learn something new in these bug reports. :)

#8 [ruby-core:74590]

Updated by skalee (Sebastian Skalacki) over 9 years ago

IMHO the documentation on Hash#== is incorrect at the moment. It says:

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.

This is definitely inconsistent with what has been observed and described in this ticket. Furthermore, two key-value pairs [k1, v1] and [k2, v2] are equal when v1 == v2 && k1.eql?(k2) && k1.hash == k2.hash:

a = Object.new   #=> #<Object:0x007f8d60eaf570>
b = Object.new   #=> #<Object:0x007f8d63713458>
def a.eql? _ ; true ; end   #=> :eql?
def b.eql? _ ; true ; end   #=> :eql?
a.eql?(b)   #=> true
{a => true} == {b => true}   #=> false
def a.hash ; 1 ; end   #=> :hash
def b.hash ; 1 ; end   #=> :hash
{a => true} == {b => true}   #=> true

The k1.hash == k2.hash condition is actually implied by how #eql? is intended to work; the description of Object#hash states that clearly:

Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.
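As an illustration of that contract, here is a minimal sketch of a value class that keeps eql? and hash consistent. The Point class and its fields are hypothetical and exist only for this example.

```ruby
# Hypothetical value class, used only to illustrate the eql?/hash contract.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Compare the fields with eql? so that no type conversion takes place.
  def eql?(other)
    other.is_a?(Point) && x.eql?(other.x) && y.eql?(other.y)
  end
  alias == eql?

  # Derive the hash from the same fields that eql? compares,
  # so that a.eql?(b) implies a.hash == b.hash.
  def hash
    [self.class, x, y].hash
  end
end

p1 = Point.new(1, 2)
p2 = Point.new(1, 2)
same_entry = ({ p1 => true } == { p2 => true })   # => true
```

Two distinct Point objects with the same coordinates now act as the same hash key, because both eql? and hash agree.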

Unfortunately, the description of Object#eql? says something different:

The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)

Therefore, it suggests that eql? could be defined as:

def eql? other
  self.hash == other.hash
end

Which is not true:

a = Object.new   #=> #<Object:0x007fe18f819660>
b = Object.new   #=> #<Object:0x007fe18e3dcc60>
def a.hash ; 44 ; end   #=> :hash
def b.hash ; 44 ; end   #=> :hash
a.eql? b   #=> false

Finally, when it comes to Set#== description:

Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?.

It is correct, although I believe it could be improved too. The need for #rehash (once introduced) could be mentioned, and the fact that #== does not imply #eql? could be emphasised. The latter is important because one could easily think that if some_array_1 == some_array_2 then some_array_1.to_set == some_array_2.to_set, but this is not true.
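A short sketch of that last point, using the Integer/Float difference mentioned in the Object#eql? description. The variable names are illustrative.

```ruby
require "set"

ints   = [1, 2]
floats = [1.0, 2.0]

arrays_equal = (ints == floats)                 # => true, == converts between numeric types
keys_eql     = 1.eql?(1.0)                      # => false, eql? does not convert
sets_equal   = (ints.to_set == floats.to_set)   # => false, elements are compared with eql?
hashes_equal = ({ 1 => :a } == { 1.0 => :a })   # => false, keys are compared with eql?
```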

To sum it all up:

`Set#rehash` is required

I guess it's not controversial, or is it? Can I make a pull request?

`Hash#==` description is seriously wrong

It says:

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.

It could say:

Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (keys according to Object#eql?, values according to Object#==) the corresponding elements in the other hash.

Moreover, a note about the need for Hash#rehash is necessary. I'm not sure whether it should be inserted here or in the "Hash Keys" section of the class description.

`Object#eql?` description is wrong

It says:

The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)

It could say:

The eql? method is used by Hash to test members for equality. This method must have the property that a.eql?(b) implies a.hash == b.hash. For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden == method, but there are exceptions. Numeric types, for example, perform type conversion across ==, but not across eql?, so: (example follows)

`Set#==` description is very good, but it could be improved as well

It says:

Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?.

It could say:

Returns true if two sets are equal. The equality of each couple of elements is defined according to Object#eql?. Please note that equality of two elements according to Object#== does not imply their equality according to Object#eql?.

Please improve the English in the changes I've proposed.

#9 [ruby-core:74592]

Updated by sawa (Tsuyoshi Sawada) over 9 years ago

Sebastian Skalacki wrote:

Set#rehash is required

I thought you had previously written:

Actually it has nothing to do with sets

Furthermore, is this even a bug?

#10 [ruby-core:74593]

Updated by skalee (Sebastian Skalacki) over 9 years ago

Tsuyoshi Sawada wrote:

I thought you had previously written:

Actually it has nothing to do with sets

I initially reported it as an issue with the Hash#== method. You pointed out the lack of Set#rehash, and I suppose it should be implemented.

Furthermore, is this even a bug?

The documentation on Hash#== clearly describes different behaviour; therefore I reported it as a bug.

#11 [ruby-core:77975]

Updated by knu (Akinori MUSHA) over 8 years ago

The documentation of Set clearly states the following:

Set assumes that the identity of each element does not change
while it is stored. Modifying an element of a set will render the
set to an unreliable state.

So it is not a bug; rather, you are not supposed to modify an object once it is stored in a set.
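One way to follow that rule in practice is to store a frozen copy of the element, so that later changes to the original cannot affect the set. This is only a sketch of one possible approach, not something the Set documentation prescribes; note that dup and freeze are shallow, so nested mutable objects would need the same treatment.

```ruby
require "set"

a = []
s = Set[a.dup.freeze]   # the set holds its own frozen copy
a << 42                 # mutating the original does not touch the stored element

still_equal = (s == Set[[]])   # => true

# Attempting to modify the stored element raises instead of corrupting the set.
rejected = begin
  s.first << 42
  false
rescue RuntimeError     # FrozenError in newer Rubies is a subclass of RuntimeError
  true
end
```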

#12

Updated by knu (Akinori MUSHA) over 7 years ago

Status changed from Assigned to Closed
