Feature #15381

open

Let double splat call `to_h` implicitly

Added by sawa (Tsuyoshi Sawada) almost 6 years ago. Updated 12 days ago.

Status: Open
Assignee: -
Target version: -
[ruby-core:90304]

Description

The single splat calls to_a implicitly on the object (if it is not an array already) so that, for example, we have the convenience of writing conditions in an array literal:

a = [
  *(:foo if some_condition),
  *(:bar if another_condition),
]

And the ampersand implicitly calls to_proc on the object (if it is not a proc already) so that we can substitute a block with an ampersand followed by a symbol:

some_method(&:some_method_name)
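
As a small runnable sketch of the two conversions described above (the condition flags and the word list are made up for illustration):

```ruby
# Hypothetical flags, chosen only for illustration.
some_condition = true
another_condition = false

# `*nil` expands to nothing because nil.to_a is [],
# so the false branch simply disappears from the array.
a = [
  *(:foo if some_condition),
  *(:bar if another_condition),
]
p a      # => [:foo]

# `&:upcase` calls Symbol#to_proc, which yields a block
# equivalent to { |s| s.upcase }.
words = %w[x y].map(&:upcase)
p words  # => ["X", "Y"]
```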

Unlike the single splat and ampersand, the double splat does not seem to implicitly call a corresponding method. I propose that the double splat should call to_h implicitly on the object if it is not already a Hash so that we can, for example, write a condition in a hash literal as follows:

h = {
  **({a: 1} if some_condition),
  **({b: 2} if another_condition),
}

There may be some other benefits of this feature that I have not noticed yet.

Updated by sawa (Tsuyoshi Sawada) almost 6 years ago

Sorry, I meant to_h, not to_hash.

And in case my intention was not clear, in the example I gave for the double splat, I expected **nil to be evaluated as **{} due to nil.to_h # => {}.
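
For comparison, here is a sketch of how the same literal can be written with an explicit to_h call, relying only on nil.to_h returning an empty Hash. The condition flags are assumptions made up for the example.

```ruby
# Hypothetical flags, chosen only for illustration.
some_condition = true
another_condition = false

# Each parenthesised expression is either a Hash or nil;
# the explicit .to_h turns nil into {} before it is splatted.
h = {
  **({a: 1} if some_condition).to_h,
  **({b: 2} if another_condition).to_h,
}

# h now contains only the :a entry.
p h
```

The proposal would make the trailing .to_h unnecessary.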

Updated by shevegen (Robert A. Heiler) almost 6 years ago

I myself have used *foobar quite a lot in ruby code, such as in:

def foobar(*args)
  args.each # and do something then
end

and I have also used &: considerably often too. The most frequent
use case for me personally is to use &: together with .map(). This
is an area where I actually prefer e. g. .map(&:strip) as opposed
to something like .map {|line| line.strip } or something like
that. While I consider the second variant more readable to me,
the &: variant is significantly shorter. (& is not very pretty
though so I try to not use it too often).

I have not yet used **, I think (strangely enough; perhaps I have not
needed it so far). So I can not say much about the proposal itself
either way. I am both clueless and neither pro nor con. :)

I agree with the above reasoning of nil.to_h which makes sense (if
the functionality in itself is approved and I guess we have to ask
matz about this).

I think this is where matz has to decide whether ** should behave as
described, stay as it is (status quo), or have some other (implicit?)
meaning that was not yet mentioned. I really can not say either way,
but I think what is also said in the issue here is that * has a better
defined meaning right now than does **. So this is where I think matz
has to decide either way.

I would recommend adding this suggestion to an upcoming developer meeting,
but perhaps not for 2018 but 2019 instead - last dev meeting this year
should ideally be for the ruby x-mas release. :D

On a side note, does anyone have one or more good or simple use cases
for **? I am trying to find an example for where it may be used but
I do not have any local example.

Last but not least, although I understand that the example given was
mostly to illustrate a point, so that's fine; the * and ** variants
with () and conditionals, are a bit ugly. :P

I understand it is the illustration of an example but I really hope
people don't write code such as "**({a: 1} if some_condition)"; it
takes my brain quite some time to process.

Updated by ted (Ted Johansson) about 5 years ago

I came here to file this feature request, only to find this had already been proposed. This would be beautiful, indeed. :-)

Updated by Eregon (Benoit Daloze) about 5 years ago

nil does not respond to to_hash though, how do you propose to deal with that?

Should ** call to_h rather than to_hash, similar to * calling to_a and not to_ary?

Your comment https://bugs.ruby-lang.org/issues/15381#note-1 shows that, but I read the mailing list and that doesn't include the edit.

Updated by Eregon (Benoit Daloze) about 5 years ago

  • Subject changed from Let double splat call `to_hash` implicitly to Let double splat call `to_h` implicitly
  • Description updated (diff)

Updated by sanjioh (Fabio Sangiovanni) 14 days ago · Edited

Eregon (Benoit Daloze) wrote in #note-4:

Should ** call to_h rather than to_hash, similar to * calling to_a and not to_ary?

Hi, I’m learning Ruby and the fact that ** calls to_hash rather than to_h surprised me; I would expect it to be consistent with * calling to_a.

I suppose that changing this behavior would break too much code, though.

A pointer to the decision making process that led to using to_hash would be helpful nonetheless.

Thanks!

Updated by zverok (Victor Shepelev) 14 days ago

I believe that the general agreement is that short to_{t} methods (to_s, to_i, to_h, to_a) have the semantics of “have some {type} representation/can be converted to {type}”, while long to_{type} ones have a meaning of “are kind-of {type}”. Mostly (though not exclusively), to_{t} is used via explicit calls, while to_{type} converts the object implicitly on various operators.
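
A minimal sketch of that distinction for strings, using two made-up classes (Temperature and Path are hypothetical, not from any library): String#+ accepts an object that defines to_str, but not one that only defines to_s.

```ruby
# Hypothetical class: has a String representation, but is not string-like.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} degrees"
  end
end

# Hypothetical class: declares itself "kind-of String" via to_str.
class Path
  def initialize(str)
    @str = str
  end

  def to_s
    @str
  end

  def to_str
    @str
  end
end

# Implicit conversion succeeds because Path defines to_str.
joined = "dir: " + Path.new("/tmp")
p joined  # => "dir: /tmp"

# Implicit conversion fails because Temperature only defines to_s.
error = begin
  "temp: " + Temperature.new(20)
rescue TypeError => e
  e
end
p error.class  # => TypeError
```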

Now, I believe that using to_h (an explicit conversion method) implicitly on ** would lead to a lot of unintended consequences. Things that typically have to_h but are not intended as “option hashes” are, for example:

  • every Enumerable
  • most of model-like things (from ActiveRecord to mere Struct)

Having them suddenly unpack when nobody intended that (say, typoing **array instead of *array; or by mistake passing model instance instead of some options to a place where it would be handled with **options) might lead to very confusing error messages at the very least, and mysterious, hard to debug behavior in worse cases.

PS: I actually believe that * invoking to_a and not to_ary brings more bad than good (things like Time or Struct can be unpacked when, again, nobody has intended it; it also breaks the mental model of explicit/implicit conversion methods). However, it is obviously too late to fix that.

Updated by Dan0042 (Daniel DeLorme) 13 days ago

to_{t} methods are for explicit type conversion, and to_{type} methods are for implicit type conversion.

{}.merge(obj) => obj is implicitly expected to be Hash, or converted via #to_hash
{}.merge(obj.to_h) => obj is explicitly converted via #to_h
{}.merge(**obj) => obj is explicitly splatted but conversion is done via #to_hash instead of #to_h. It makes no sense, as many many people have commented through the years.
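
The three cases can be checked with a short sketch. OnlyToH, WithToHash and the attempt helper are hypothetical names introduced just for this illustration.

```ruby
# Hypothetical class that only offers the explicit conversion.
class OnlyToH
  def to_h
    { a: 1 }
  end
end

# Hypothetical class that offers the implicit conversion.
class WithToHash
  def to_hash
    { a: 1 }
  end
end

# Helper: returns the block's value, or the error class if a TypeError is raised.
def attempt
  yield
rescue TypeError => e
  e.class
end

results = {
  merge_to_hash: attempt { {}.merge(WithToHash.new) },   # converted via to_hash
  splat_to_hash: attempt { { **WithToHash.new } },       # converted via to_hash
  merge_to_h:    attempt { {}.merge(OnlyToH.new) },      # TypeError
  splat_to_h:    attempt { { **OnlyToH.new } },          # TypeError
  explicit:      attempt { { **OnlyToH.new.to_h } },     # explicit to_h works
}
p results
```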

If you're double-splatting an object, of course you'd expect it to be convertible to a Hash. Please assume I'm not an idiot and that **obj is my intent, not a typo. Being overly restrictive (only true Hash-like objects can be used!) serves no purpose. Having to do **obj.to_h is silly, and adding #to_hash to a class in order to make it splattable is even more silly (possibly dangerous).

The fix is easy and backward-compatible; just let double splat convert the object with both #to_hash and #to_h. But instead the "fix" that we got is a weird (although useful) one-off exception for **nil. 🙄
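
The lookup order suggested here can be approximated in plain Ruby. This is only a sketch of the idea, not how the interpreter works: splat_hash, Point2 and draw are hypothetical names, and the real change would live inside the ** handling itself.

```ruby
# Hypothetical helper that mimics the proposed order:
# Hash as-is, then to_hash, then to_h, otherwise a TypeError.
def splat_hash(obj)
  return obj if obj.is_a?(Hash)
  return obj.to_hash if obj.respond_to?(:to_hash)
  return obj.to_h if obj.respond_to?(:to_h)

  raise TypeError, "no implicit conversion of #{obj.class} into Hash"
end

# Hypothetical value object; Struct defines to_h but not to_hash.
Point2 = Struct.new(:x, :y)

def draw(x:, y:)
  "#{x},#{y}"
end

drawn = draw(**splat_hash(Point2.new(1, 2)))
p drawn            # => "1,2"
p splat_hash(nil)  # => {}
```

Objects with neither method, such as a bare Object.new, would still raise TypeError under this order.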

Updated by zverok (Victor Shepelev) 13 days ago

{}.merge(**obj) => obj is explicitly splatted but conversion is done via #to_hash instead of #to_h. It makes no sense, as many many people have commented through the years.

I don’t think it can be treated as “explicitly” (invoking hash conversion); it is rather “we apply an operator, which would work with kind-of hash”. We can apply the same argument to 1 + obj: “you see, I am explicitly summing it with an integer; why is it not converted with to_int?”

**obj is my intent, not a typo.

What if it is a typo/error? Assume this API (frequent in many long-living codebases, which are partially old-style “options hash” and partially new-style “keyword arguments”):

def process(something, options = {})
  # ...
  other_method(**options)
end

Now, if the user misuses it (by mistake or misunderstanding) and passes some, say, ActiveRecord model as a second parameter, then, with to_h, it would be successfully unpacked, and might go this way (as “keyword arguments” which shouldn’t have been) through several more delegating methods before failing in completely different place—with an extremely hard to debug problem.
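
To make the "fail early" behavior concrete, here is a sketch of what happens today when a non-hash reaches the ** in such an API. Account is a hypothetical Struct standing in for a model, and other_method simply returns its keyword arguments.

```ruby
# Hypothetical stand-in for a model object; it has to_h but not to_hash.
Account = Struct.new(:id, :name)

def other_method(**options)
  options
end

def process(something, options = {})
  # ...
  other_method(**options)
end

# Intended use: an options hash is unpacked into keyword arguments.
ok = process(:job, { retries: 3 })
p ok

# Misuse: the Struct is rejected right at the ** site with a TypeError,
# instead of being unpacked as id:/name: keywords further down the chain.
failure = begin
  process(:job, Account.new(1, "a"))
rescue TypeError => e
  e
end
p failure.class  # => TypeError
```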

So, what we are weighing here are:

  • convenience of unpacking non-hashes into keyword arguments (BTW, what exactly is the use case for this convenience?..), vs.
  • problem of things being unintentionally unpacked, considering how many objects have to_h method; basically failing to “fail early” (where the mistake was made) and instead passing the mistake along to who-knows-what depth.

I honestly fail to see the use case for to_h being used with ** that outweighs the possible problems (other than **(params if condition?), which was actually handled).

Updated by Dan0042 (Daniel DeLorme) 13 days ago

I don’t think it can be treated as “explicitly”

The ** is right there in the code, written out caller-side, you can't get more "explicit" than that.

We can apply the same argument to 1 + obj

No we can't; as you know + is a method (I guess that would make ** a "true" operator?) so 1.+(obj) is similar to {}.merge(obj) and falls under implicit conversion (which the method is free to do or not)

What if it is a typo/error?

Then I'll fix my stupid mistakes by my own stupid self, thank you very much.

Now, if the user misuses it (by mistake or misunderstanding) and passes some, say, ActiveRecord model as a second parameter, then, with to_h, it would be successfully unpacked

I think that would be awesome. If I do other_method(**model) and that model is representable as a hash, passing it as keyword arguments is beautiful.
Imagine something like

ValidOptions = Struct.new(:ssl, :host, :port)
opts = ValidOptions.new
opts.port = 999
setup_request(**opts)

problem of things being unintentionally unpacked, considering how many objects have to_h method

That never happens. I have never ever written code where foo(**opts) throws "no implicit conversion of Object into Hash" and then I realize I really meant to use something other than opts.

(other than **(params if condition?), which was actually handled).

I'll admit that **nil covers 90% of the benefits, but the adhoc-ness of it all is a bit frustrating.

Updated by jeremyevans0 (Jeremy Evans) 13 days ago

There are certainly backwards compatibility issues from changing ** from calling to_hash to to_h. I'm sympathetic to the argument that calling to_h fits better, as the conversion is explicit and not implicit, but the backwards compatibility costs are probably too high. Maybe in Ruby 4?

One possible approach is defining **@ as an operator method, something like:

class BasicObject
  def **@ = to_hash
end

Using the unary ** operator on an object would call the **@ method, which should return a hash (or raise an exception). The method name itself is designed to be similar to +@ and -@, which are called when you use the unary + and - operators. Users could then change the **@ method to call to_h instead of to_hash if they want, and get the explicit conversion behavior (either for all objects, or for specific objects/classes).
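
For readers unfamiliar with the existing +@ and -@ hooks that this idea is modeled on, a minimal sketch follows. Offset is a hypothetical class; **@ itself is only a proposal in this thread and is not shown.

```ruby
# Hypothetical value class, for illustration only.
class Offset
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Called for the unary minus: -offset
  def -@
    Offset.new(-value)
  end

  # Called for the unary plus: +offset
  def +@
    self
  end
end

o = Offset.new(5)
negated = (-o).value
same = (+o).value
p negated  # => -5
p same     # => 5
```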

I proposed *@ as the method called by the unary * operator in #2013, with a working patch, before ** was introduced for keywords. This was the first issue I filed, back in 2009. It was eventually rejected as there was not much discussion on it, but it was still thought an interesting idea at the time it was rejected.

Updated by Dan0042 (Daniel DeLorme) 13 days ago

There are certainly backwards compatibility issues from changing ** from calling to_hash to to_h.

What kind of backwards compatibility issues exactly? My idea was to call #to_hash and fall back to #to_h, that should ensure no backwards incompatibility at all, unless I'm missing something?

Using the unary ** operator on an object would call the **@ method, which should return a hash (or raise an exception).

That's a very intriguing solution, and very elegant I have to say.

Updated by zverok (Victor Shepelev) 13 days ago

I think that would be awesome. If I do other_method(**model) and that model is representable as a hash, passing it as keyword arguments is beautiful.

...until you passed it erroneously, and it was never meant to be, and the error happened not where it should happen but in some completely different place.

...until somebody tries to make heads or tails of the legacy codebase and assumes that something that is invoked with **value is hash-like, while in fact, it was a value object, and there is no way to know it without chasing it back and forth through the call stack.

Now, imagine if that was the author’s intention (“here we unpack our model into the keyword args”) and the author just wrote (like, five more characters)

other_method(**model.to_h)

...and the reader immediately sees where the “transition point” is from value object/model to keyword args/hash-like data.

Imagine something like

ValidOptions = Struct.new(:ssl, :host, :port)

Now imagine that if you want to have such options as kwarg-passable, you can just

ValidOptions = Struct.new(:ssl, :host, :port) { alias to_hash to_h }

...and that immediately communicates “this particular value object (unlike many other value objects) thinks of itself as a hash-like object that would be unpacked probably at some point”.
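
A sketch of that opt-in in use. The setup_request method and the option values are assumptions made up for the example; only the Struct definition comes from the comment above.

```ruby
# The alias makes this particular Struct respond to to_hash,
# which is what ** looks for on non-Hash objects.
ValidOptions = Struct.new(:ssl, :host, :port) { alias to_hash to_h }

# Hypothetical receiver of the keyword arguments.
def setup_request(ssl:, host:, port:)
  scheme = ssl ? "https" : "http"
  "#{scheme}://#{host}:#{port}"
end

opts = ValidOptions.new(true, "example.com", 999)

# No explicit .to_h is needed because ValidOptions opted in via to_hash.
url = setup_request(**opts)
p url  # => "https://example.com:999"
```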

problem of things being unintentionally unpacked, considering how many objects have to_h method

That never happens. I have never ever written code where foo(**opts) throws "no implicit conversion of Object into Hash" and then I realize I really meant to use something other than opts.

I saw this a lot (especially transitioning from older to newer syntaxes, libraries, and Ruby versions). Fighting anecdata with anecdata! :)

The conversion from “it had just one extra parameter” to “it has a hash of extra parameters” is a frequent refactoring when some library matures, and I really like to have it reported to me early that some object was passed where it doesn’t correspond to the target method’s expectations. NoMethodError/“no implicit conversion” is a wonderful tool for this, but only if it happens in the closest place to the possible error.

PS: Honestly, I am still not sure which problem we are trying to solve (that is not covered by the **nil addition) that is so frequent and has no other easy solution to be worth all the drawbacks.

Updated by Eregon (Benoit Daloze) 12 days ago

Regarding **@, I'm not a big fan because ** on the caller side is very much syntax, a bit like a keyword, and it has the meaning of passing the Hash/expression as keyword arguments and not as a positional argument.
Of course you could have ** the syntax and **@ the unary operator called by it, but effectively it's not really an operator in the sense of unary +/- or binary operators, e.g. it can only be used in calls and in Hash literals.

Regarding trying to_hash then to_h or vice versa, it will have some overhead for the second method being tried for non-Hash objects.

foo(**obj.to_h) doesn't sound too bad to me as I would expect it's needed in only a few places (compared to all **hash usages) and it's probably clearer what is happening that way if obj is not a Hash/to_hash.
