Bug #19113
closedInconsistency in retention of compare_by_identity flag in Hash methods
Description
Hash.[]
and Hash.ruby2_keywords_hash
retain the compare_by_identity flag for non-empty hashes, but do not retain it for empty hashes:
hs = [{}.compare_by_identity, {:a=>1}.compare_by_identity]
hs.map{|h| Hash[h].compare_by_identity?}
# => [false, true]
hs.map{|h| Hash.ruby2_keywords_hash(h).compare_by_identity?}
# => [false, true]
This inconsistency seems like a bug.
Hash#compact
always drops the compare_by_identity flag, but it is documented as returning a copy of self, implying the compare_by_identity flag is kept (since #dup and #clone retain the flag).
{}.compare_by_identity.compact.compare_by_identity?
# => false
I'm not sure whether is a bug, because it is consistent, but I think retaining the flag makes more sense.
I'll try to work on a fix for both of these issues tomorrow.
Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago
I think the following behavior makes the most sense:
-
Hash.[]
should never retain the compare_by_identity flag. It doesn't copy the default value/proc, so retaining the compare_by_identity flag does not make sense. -
Hash.ruby2_keywords_hash
should always retain the compare_by_identity flag. It copies the default value/proc, so retaining the compare_by_identity flag makes sense. -
Hash#compact
should copy the default value/proc and retain the compare_by_identity flag. It currently does neither. However, the documentation says it returns a copy of the receiver, which implies the result should have the same default value/proc and compare_by_identity flag as the receiver.
I've submitted a pull request for these changes: https://github.com/ruby/ruby/pull/6702
Updated by ko1 (Koichi Sasada) almost 2 years ago
I think Hash[]
remains all keys on the new Hash so compare_by_identity
should be remained.
Updated by headius (Charles Nutter) almost 2 years ago
jeremyevans0 (Jeremy Evans) wrote in #note-1:
I think the following behavior makes the most sense:
Hash.[]
should never retain the compare_by_identity flag. It doesn't copy the default value/proc, so retaining the compare_by_identity flag does not make sense.
I disagree. The contents of the other hash will have been populated using identity comparison rather than equality comparison. If that characteristic does not propagate to the new hash, any == keys will collide or else the set of keys will have to change. Is Hash.[]
intended to make a copy, or a new hash populated with a subset of the original keys using non-identity, non-default semantics?
I agree with your other points.
Updated by headius (Charles Nutter) almost 2 years ago
FWIW I ran into this while implementing Ruby 3.1 identhash semantics in JRuby, and then realized this is now one of the few "copy" paths that does not propagate identity comparison.
Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago
headius (Charles Nutter) wrote in #note-3:
jeremyevans0 (Jeremy Evans) wrote in #note-1:
I think the following behavior makes the most sense:
Hash.[]
should never retain the compare_by_identity flag. It doesn't copy the default value/proc, so retaining the compare_by_identity flag does not make sense.I disagree. The contents of the other hash will have been populated using identity comparison rather than equality comparison. If that characteristic does not propagate to the new hash, any == keys will collide or else the set of keys will have to change. Is
Hash.[]
intended to make a copy, or a new hash populated with a subset of the original keys using non-identity, non-default semantics?
It's explicitly documented as returning a new hash, not a copy: Returns a new Hash object populated with the given objects
.
The fact that it doesn't copy the default value/proc indicates to me that the compare_by_identity flag should not be copied either. However, I don't feel strongly regarding this. We can make it so the compare_by_identity flag is always copied. We do need to make some change, because the current behavior is inconsistent.
Updated by headius (Charles Nutter) almost 2 years ago
Even though it's a "new Hash", it is supposed to be populated with "the given objects". If losing identity comparison means some keys don't get populated, I would consider that broken.
Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago
headius (Charles Nutter) wrote in #note-6:
Even though it's a "new Hash", it is supposed to be populated with "the given objects". If losing identity comparison means some keys don't get populated, I would consider that broken.
If you turn the hash into an array, you have the same given objects
, but different results:
h = {'a'=>1}.compare_by_identity
h['a'] = 2
Hash[h] # {'a'=>1, 'a'=>2}.compare_by_identity
Hash[h.to_a] # {'a'=>2}
Not respecting the compare_by_identity flag (as my PR does) would make Hash[h] == Hash[h.to_a]
.
I would say that populated with the given objects
does not imply resulting in the same set of keys.
Also, given objects
(not given object
) implies it looks at the content of the hash, but not the hash object itself (hence why default value/proc is not copied).
Here's the pseudocode I would consider based on the documentation:
# new hash
output_hash = {}
# populated with the given objects
input_hash_or_array.each do |k, v|
output_hash[k] = v
end
output_hash
There are certainly valid arguments for both sides, so we'll have to see what @matz (Yukihiro Matsumoto) decides.
Updated by Eregon (Benoit Daloze) almost 2 years ago
The way I see it, Hash[*args]
converts args to a Hash, and args is only considered as key-value pairs. So if args = [some_Hash]
, it's just one of the representation of key-value pairs but the behavior shouldn't be special with a Hash argument otherwise.
IOW I agree with Jeremy here. To copy a Hash there is .dup
/.clone
.
Updated by matz (Yukihiro Matsumoto) almost 2 years ago
I agree with @jeremyevans0 (Jeremy Evans) too.
Matz.
Updated by jeremyevans (Jeremy Evans) over 1 year ago
- Status changed from Open to Closed
Applied in changeset git|d3197def882b47e7c57cdddfe8d62f62fef9d3f7.
Do not copy compare_by_identity flag for non-empty hashes in Hash.[]
It wasn't copied for empty hashes, and Hash.[] doesn't copy the
default value, so copying the compare_by_identity flag does not
make sense.
Partially Fixes [Bug #19113]