Feature #21858

closed

`Kernel#Hash` considers `to_h` too

Feature #21858: `Kernel#Hash` considers `to_h` too

Added by ccmywish (Aoran Zeng) about 2 months ago. Updated 7 days ago.

Status:
Feedback
Assignee:
-
Target version:
-
[ruby-core:124660]

Description

  1. Kernel#Integer uses to_int first and to_i second
  2. Kernel#Array uses to_ary first and to_a second
  3. Kernel#Hash only uses to_hash

I don't quite understand why there is a need for differential treatment here.

I admit that the only benefit of having Hash() fall back to to_h may be that it keeps these conversion APIs consistent with one another.
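The asymmetry described above can be seen with a small sketch. The class name `OnlyExplicit` is hypothetical and exists only for illustration; it defines the explicit conversion methods (to_i, to_a, to_h) and none of the implicit ones (to_int, to_ary, to_hash).

```ruby
# Hypothetical class for illustration: only explicit conversions are defined.
class OnlyExplicit
  def to_i
    42
  end

  def to_a
    [1, 2]
  end

  def to_h
    { a: 1 }
  end
end

obj = OnlyExplicit.new

int_result = Integer(obj) # no to_int, so Integer() falls back to to_i
ary_result = Array(obj)   # no to_ary, so Array() falls back to to_a

# Hash() consults only to_hash, so the to_h defined above is ignored.
hash_result =
  begin
    Hash(obj)
  rescue TypeError => e
    e.class
  end
```

Under current behavior, the first two calls succeed while the third ends in a TypeError.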

Updated by Dan0042 (Daniel DeLorme) about 2 months ago Actions #2 [ruby-core:124663]

+1
Actually I think this is just an oversight. #to_h was added to Struct, Hash, NilClass in Ruby 2.0, and to Array, Enumerable in Ruby 2.1; previously there was just #to_hash. And Hash() remained unchanged instead of adapting to the new convention. I don't think that was a conscious decision.

Updated by matz (Yukihiro Matsumoto) about 1 month ago Actions #3 [ruby-core:124791]

I have no strong objection, but it is true that we would have a big side effect if we allowed Hash() to call to_h, as @Dan0042 (Daniel DeLorme) pointed out, for example.
What do you think?

Matz.

Updated by ccmywish (Aoran Zeng) about 1 month ago Actions #4 [ruby-core:124796]

as @Dan0042 (Daniel DeLorme) pointed out for example

Just to clarify — maybe I missed something, but it seems there might be a slight misunderstanding. In his comment, Daniel mainly provided historical context about when to_h was introduced and suggested that the current behavior of Hash() was likely an oversight. I couldn't find where he pointed out a "big side effect." If there is a specific example he mentioned, I may have overlooked it; otherwise, could this possibly be referring to a different discussion?


What do you think?

Speaking from my own programming habits, I almost never use conversion functions like Array() or Integer() — I tend to explicitly call .to_xxx methods in a more OO style. In fact, I only discovered that Ruby provides these global-style functions when I was reading the Kernel module documentation.

Given that these functions are provided, I believe they should behave in a predictable and consistent way. Otherwise, users are left to memorize special cases — like the fact that Hash() does not consider to_h, unlike its counterparts.

Regarding the potential side effects of this change: since I don’t widely use these conversion functions myself, it’s hard for me to assess how much impact this would have in the wider community. Perhaps it would be worthwhile to gather some usage feedback from other developers.

Updated by mame (Yusuke Endoh) about 1 month ago Actions #5 [ruby-core:124806]

During the dev meeting discussion, @ko1 (Koichi Sasada), not @Dan0042 (Daniel DeLorme), pointed out the following behavior change. It seems Matz got confused about that.

S = Struct.new(:a, :b)
obj = S.new(1, 2)

Hash(obj) #=> current: can't convert S into Hash (TypeError)
          #=> proposal: {a: 1, b: 2}

The behavior changes for objects that only define to_h.

I'm not sure if this is a "big side effect", but the type conversion methods like Kernel#Integer are in principle strict (though this design isn't always strictly enforced), and they may raise an exception for weird input (e.g., Integer("0x1x") raises an exception). Therefore, there might be code that expects Hash(struct) to raise an exception.
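As a minimal sketch of what "strict" means here, the following contrasts Kernel#Integer with String#to_i on the same malformed input. The variable names are illustrative only.

```ruby
strict_ok = Integer("0x1f") # a well-formed hex literal is accepted
lenient   = "0x1x".to_i     # String#to_i parses the leading "0" and stops

# Kernel#Integer rejects the malformed literal instead of guessing.
strict_error =
  begin
    Integer("0x1x")
  rescue ArgumentError => e
    e.class
  end

# Since Ruby 2.6, exception: false returns nil instead of raising.
no_raise = Integer("0x1x", exception: false)
```

Code that relies on the raising form to reject bad input is the kind of usage that could, in principle, exist for Hash() as well.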

Updated by Dan0042 (Daniel DeLorme) about 1 month ago Actions #6 [ruby-core:124814]

Hash(obj) #=> current: can't convert S into Hash (TypeError)
          #=> proposal: {a: 1, b: 2}

This is not really a behavior change but more like a backward-compatible behavior addition/extension

It's really in the same vein as adding a method:

{a:1,b:2}.except(:b) #Ruby 2.7: undefined method `except' for {:a=>1, :b=>2}:Hash (NoMethodError)
                     #Ruby 3.0: {:a=>1}

Or adding an optional parameter:

%w[a b a c].tally({}) #Ruby 3.0: wrong number of arguments (given 1, expected 0) (ArgumentError)
                      #Ruby 3.1 => {"a" => 2, "b" => 1, "c" => 1}

Neither of these can really be considered a "behavior change"

Updated by mame (Yusuke Endoh) about 1 month ago Actions #7 [ruby-core:124817]

If we assume that changing a call that previously raised an exception to return a value is always "a backward-compatible behavior addition," then by that logic, we could also make Kernel#raise return a value without issue. :-)

I said Kernel#Hash is a strict conversion method. There could be users and existing code that rely on it raising an exception for unexpected input.

That being said, I don't have a strong personal opinion on this proposal itself. I just wanted to clarify Matz's comment.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago Actions #8 [ruby-core:124819]

mame (Yusuke Endoh) wrote in #note-7:

I said Kernel#Hash is a strict conversion method.

It's true that Integer() is a strict conversion method, but Array() and String() are notably more lenient, and I've always seen Hash() as more similar to those two. Especially since Hash() already converts nil and [] to an empty hash, I don't think it can be considered a strict conversion method. In fact, these two special cases are so weird that it's as if Hash() already has partial support for #to_h.

There could be users and existing code that rely on it raising an exception for unexpected input.

Normally, "expected input" is something that can be converted to Hash, and "unexpected input" is everything else. If Struct can be converted via Hash() it simply means there's a greater range of valid inputs. I would be very very surprised to see Hash() being used to exclude Struct objects specifically. If a developer wanted strict type validation, they would likely use is_a?(Hash) or Hash.try_convert or RBS or such.
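For illustration, here is a short sketch of the stricter checks mentioned above, applied to a Struct instance. The name `Pair` is hypothetical.

```ruby
Pair = Struct.new(:a, :b) # hypothetical struct; it has to_h but not to_hash
pair = Pair.new(1, 2)

type_check    = pair.is_a?(Hash)           # plain class check
try_struct    = Hash.try_convert(pair)     # nil, because there is no to_hash
try_real_hash = Hash.try_convert({ a: 1 }) # a real Hash is returned as is
```

Neither check depends on whether Hash() raises, so both would keep rejecting the struct even if Hash() started accepting it.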

Updated by matz (Yukihiro Matsumoto) 8 days ago Actions #9 [ruby-core:125040]

  • Status changed from Open to Feedback

The consistency argument is noted, but I have reservations about
introducing to_h into Hash().

Unlike to_ary/to_a or to_int/to_i, to_h has an unusual
property: it is defined on Enumerable and Array, but whether it
succeeds depends on the content rather than the type of the object.
For example, [[1,2]].to_h succeeds but [1,2,3].to_h raises —
both are Array.

to_hash serves as a reliable signal that an object "is a Hash",
while to_h means "can be converted, maybe". Feeding Hash() with
a method that may raise depending on runtime content makes the
function less predictable, not more.

I understand the desire for consistency, but in this case I think
the asymmetry with Integer()/Array() is intentional by accident
and worth preserving deliberately.

Matz.
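One place where the to_hash signal is visible today is implicit conversion of method arguments. The sketch below uses Hash#merge as an example; the names `HashLike` and `Point` are hypothetical.

```ruby
# Hypothetical class that opts into implicit conversion by defining to_hash.
class HashLike
  def to_hash
    { a: 1 }
  end
end

Point = Struct.new(:x, :y) # defines to_h, but not to_hash

# Hash#merge implicitly converts its argument through to_hash.
merged = { b: 2 }.merge(HashLike.new)

# A Struct only offers to_h, so the implicit conversion is refused.
struct_error =
  begin
    { b: 2 }.merge(Point.new(1, 2))
  rescue TypeError => e
    e.class
  end
```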

Updated by Dan0042 (Daniel DeLorme) 7 days ago · Edited Actions #10 [ruby-core:125054]

matz (Yukihiro Matsumoto) wrote in #note-9:

Feeding Hash() with a method that may raise depending on runtime content makes the function less predictable, not more.

It's important to note that this is already the case; if you pass it an array, Hash() may raise depending on runtime content:

Hash([])      #=> {}
Hash([1,2])   #TypeError
Hash([[1,2]]) #TypeError
#vs
[].to_h       #=> {}
[1,2].to_h    #TypeError
[[1,2]].to_h  #=> {1=>2}

Given that Hash([]) returns {}, I find it highly confusing that Hash([[1,2]]) is an error. If something can be converted to a hash I would expect Hash() to do just that. Using #to_h would make this more consistent, but really this is not so much a matter of consistency but more a matter of usefulness.

It would be fine to keep the current behavior if there is a use case for Hash([[1,2]]) being an error, but really I can't think of any reason anyone would want this.
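To make the discussion concrete, here is a minimal sketch of one possible reading of the proposal, written as a user-level helper. The method name `hash_with_fallback` is hypothetical, and the exact semantics (for instance, whether the result of to_h would be type-checked) are an assumption, not something the thread specifies.

```ruby
# Hypothetical helper: try Hash() first, then fall back to to_h.
# This is an assumed approximation of the proposal, not an actual patch.
def hash_with_fallback(obj)
  Hash(obj)
rescue TypeError
  # Re-raise the original error when there is no to_h to fall back on.
  raise unless obj.respond_to?(:to_h)
  obj.to_h
end

Item = Struct.new(:a, :b) # hypothetical struct for the example

from_pairs  = hash_with_fallback([[1, 2]])       # {1 => 2}
from_struct = hash_with_fallback(Item.new(1, 2)) # {a: 1, b: 2}
from_nil    = hash_with_fallback(nil)            # {}

# Content that cannot form a hash still raises, this time from Array#to_h.
bad_content =
  begin
    hash_with_fallback([1, 2])
  rescue TypeError => e
    e.class
  end
```

In this sketch, arrays whose content cannot form a hash still raise a TypeError, so only the set of accepted inputs grows.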
