Feature #16435

Array#to_proc

Added by zverok (Victor Shepelev) 7 months ago. Updated 6 months ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:96338]

Description

The idea is obvious, but I couldn't find it discussed anywhere on the tracker before. Please point me to previous discussions, if any.

class Array
  def to_proc
    proc { |v| v.dig(*self) }
  end
end
# Or, alternatively (see the discussion of alternatives at the end of the proposal):
class Array
  def to_proc
    proc { |v| v[*self] }
  end
end

The implementation seems to provide clean and unambiguous indexing into collections in Enumerable methods:

# Basic objects data, which could be obtained from JSON, CSV, Database...
data = [
  {name: 'John', department: {id: 1, title: 'Engineering'}, salary: 1000}, 
  {name: 'Jane', department: {id: 1, title: 'Engineering'}, salary: 1200},
  {name: 'Boris', department: {id: 2, title: 'Accounting'}, salary: 800},
  {name: 'Alice', department: {id: 3, title: 'Management'}, salary: 1500}
]
data.map(&[:name])
# => ["John", "Jane", "Boris", "Alice"] 
data.min_by(&[:salary])
# => {:name=>"Boris", :department=>{:id=>2, :title=>"Accounting"}, :salary=>800} 
pp data.group_by(&[:department, :title])
# {"Engineering"=>
#   [{:name=>"John",
#     :department=>{:id=>1, :title=>"Engineering"},
#     :salary=>1000},
#    {:name=>"Jane",
#     :department=>{:id=>1, :title=>"Engineering"},
#     :salary=>1200}],
#  "Accounting"=>
#   [{:name=>"Boris",
#     :department=>{:id=>2, :title=>"Accounting"},
#     :salary=>800}],
#  "Management"=>
#   [{:name=>"Alice",
#     :department=>{:id=>3, :title=>"Management"},
#     :salary=>1500}]}

# Works with arrays, too:
data.map(&:values).map(&[0])
# => ["John", "Jane", "Boris", "Alice"]

# And with mixes:
data.group_by(&[:department, :title]).values.map(&[0, :name]) 
# => ["John", "Boris", "Alice"]

Naked structured data seems to be common enough to justify making it easier to work with.

Some prior info:

  • Googling around, I found that the idea was first invented back in 2014, and again in 2015; I'm not sure whether it was ever proposed on the tracker.
  • Other proposals for Array#to_proc were: to call several methods in sequence (1, 2), to call a method with an argument (1, 2, 3), and to call several methods in parallel (1).

Honestly, I feel that proposed usage is the most frequently needed.

Also, the readability of the version seems more or less straightforward:

# Existing shortcut, for example:
data.map(&:keys)
# Is equivalent to
data.map { |x| x.keys }
#          ^^^^^ -- "just remove this part"

# Proposed shortcut:
data.map(&[:name])
# Is equivalent to
data.map { |x| x[:name] }
#          ^^^^^ -- "just remove this part"

dig or [] alternative implementations

It is up for discussion (if the whole idea holds water) whether dig should be used or just []. The dig version is convenient for nested structures but slightly breaks the "equivalency" shown above, while the plain [] version will allow this:

data.map(&:values).map(&[1..-1])
# => [[{:id=>1, :title=>"Engineering"}, 1000], [{:id=>1, :title=>"Engineering"}, 1200], [{:id=>2, :title=>"Accounting"}, 800], [{:id=>3, :title=>"Management"}, 1500]]

Maybe, for the sake of explainability, "just []" should be preferred, with digging performed by other means.
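
To make the trade-off concrete, here is a minimal sketch of how the two variants differ. The lambdas via_dig and via_index are hypothetical stand-ins for the two proposed Array#to_proc bodies, used here so that Array itself is left unmodified; the sample row is taken from the data above.

```ruby
# Hypothetical helpers mirroring the two proposed Array#to_proc bodies,
# written as lambdas so that Array is not monkey-patched.
via_dig   = ->(path) { proc { |v| v.dig(*path) } }
via_index = ->(args) { proc { |v| v[*args] } }

row = { name: 'John', department: { id: 1, title: 'Engineering' }, salary: 1000 }

# dig walks nested structures one key at a time.
title = via_dig.call([:department, :title]).call(row)

# [] with a single key matches the block form { |x| x[:name] } exactly.
name = via_index.call([:name]).call(row)

# [] forwards every element to a single call, so Array#[] accepts
# a range or a (start, length) pair.
tail  = via_index.call([1..-1]).call(row.values)
slice = via_index.call([0, 2]).call(row.values)

# Hash#[] takes exactly one argument, so a two-element path fails
# with the [] variant instead of digging.
nested_error =
  begin
    via_index.call([:department, :title]).call(row)
  rescue ArgumentError => e
    e.class
  end
```

In other words, the dig variant reads an array as a path, while the [] variant reads it as the argument list of one [] call; the same literal such as [0, 2] means different things under the two readings.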

Updated by Dan0042 (Daniel DeLorme) 7 months ago

In all honesty, the more time goes by, the less I like the various proposals that replace a block by the convert-to-proc & operator.

It looks to me like all the examples could be just as succinctly represented with numbered parameters, with the performance advantage of not requiring intermediate Proc objects, and the cognitive advantage of not having to remember what the behavior of Array#to_proc is.

data.map{_1[:name]}
data.min_by{_1[:salary]}
data.group_by{_1.dig(:department, :title)}
data.map{_1.values[0]}
data.group_by{_1.dig(:department, :title)}.values.map{_1.first[:name]}

Although, notice how much better it would have looked with a _ implicit parameter and/or omitted parameter. :-) ;_;

data.map{_[:name]}
data.min_by{_[:salary]}
data.group_by{_.dig(:department, :title)}
data.map{_.values[0]}
data.group_by{_.dig(:department, :title)}.values.map{_.first[:name]}

data.map{.dig(:name)}
data.min_by{.dig(:salary)}
data.group_by{.dig(:department, :title)}
data.map{.values[0]}
data.group_by{.dig(:department, :title)}.values.map{.first[:name]}

Updated by zverok (Victor Shepelev) 6 months ago

Dan0042 (Daniel DeLorme) well, it could be a "Stockholm syndrome" on my side, but I tend to believe thinking "I want something to_proc-able here" leads me to structure code more clearly, and understand "what should be where". (I already expressed this thought in several forms while arguing for "method reference" operator and explaining why I dislike implicit block args.)

Like, generically, when I have somewhere...

[list, of, some, objects].select { |o| check some condition (o) }.map { |o| some mapping (o) }

...and I want to write it prettier and DRYer, the first thing I am thinking is "shouldn't it be a part of o's interface?" (so I can do select(&:condition?)), and then "shouldn't it be some concept in the current scope?" (so I can do at least select(&method(:condition?))). My practical experience and intuition (though, it is hard to argue for or against either of them with simple and unambiguous examples) say that typically it makes code much cleaner, better separated and better testable.
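
As a rough illustration of the two refactorings described above, the sketch below moves one condition into the object's interface and another into a method in the current scope. Employee, manager?, well_paid? and the sample staff are hypothetical names made up for this example, not part of the proposal.

```ruby
# Hypothetical record type; the condition lives in the object's interface.
Employee = Struct.new(:name, :role, :salary) do
  def manager?
    role == :management
  end
end

# Hypothetical condition that belongs to the current scope instead.
def well_paid?(employee)
  employee.salary >= 1000
end

staff = [
  Employee.new('John',  :engineering, 1000),
  Employee.new('Boris', :accounting,  800),
  Employee.new('Alice', :management,  1500)
]

# "Shouldn't it be a part of o's interface?"
managers = staff.select(&:manager?)

# "Shouldn't it be some concept in the current scope?"
well_paid = staff.select(&method(:well_paid?))
```

Both conditions now have a name and can be tested on their own, which is the separation the comment argues for.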

So, it makes me super-sad that now every discussion of "how we can have atomic functional objects" meets with "just use numbered parameters if you want to spare some characters". No, I don't, I can type faster than I think.
I want to spare some concepts, to think less, and to think more clearly.

One absolutely hypothetical example: if you have max_by{_1[:salary]}, and then somebody says "oh, but exclude management from the calculation", it is all too easy to just write max_by{_1[:role] == :management ? 0 : _1[:salary]}, and continue this way till you have a very short, very unreadable, very convoluted block of logic. While with max_by(&[:salary]) you'll stop for a second... And maybe filter out role=management earlier, or question the whole logic, or something else. (But I am aware that typically I am arguing against "limitations on what you can write, so you'll think better" in Ruby :))

One additional theoretical consideration is: if we one day have some cool Ruby optimizer, "atomic" statements should be easier to analyze and optimize (inline or something) than "blocks with almost the same number of characters" but several different statements.
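
The hypothetical above can be sketched as follows. The :role key and the sample rows are assumptions added for illustration (the data in the description has no role field), and ordinary block parameters are used so the sketch does not depend on numbered-parameter support.

```ruby
# Assumed sample data; :role is a made-up field for this example.
staff = [
  { name: 'John',  role: :engineering, salary: 1000 },
  { name: 'Boris', role: :accounting,  salary: 800 },
  { name: 'Alice', role: :management,  salary: 1500 }
]

# The special case folded into the block: short, but the block now
# does two jobs (filtering and ranking).
inline = staff.max_by { |e| e[:role] == :management ? 0 : e[:salary] }

# Filtering first keeps the ranking step a single-purpose accessor.
filtered = staff
  .reject { |e| e[:role] == :management }
  .max_by { |e| e[:salary] }
```

The two versions agree on this data, but not in general: if every row were management, the inline version would still return a manager, while the filtered version would return nil.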

Updated by Dan0042 (Daniel DeLorme) 6 months ago

If I understand correctly, you are saying that of these:

list.select(&:condition?)
list.select{.condition?}
list.select{_1.condition?}
list.select{|x|x.condition?}

you consider that the first one provides some kind of clarity of understanding that you don't find in the other three and that leads to better design?

If that's the case then I'm afraid you've lost me. Apart from niceness of syntax, these four lines are conceptually identical to me. I agree with your example that select { |o| check some condition (o) } would be better structured as select(&:condition?) ... OR the equivalent (just a bit uglier) select{_1.condition?}

I also have an issue with the way you repurpose array literals to play the role of hash/array accessors. That's a hack. It relies on the coincidence that they both use square brackets. What do you expect that max_by(&ary) would mean? If there was such a thing as Array#to_proc I would expect it to act as if the array was a call-able object. I imagine it would be similar to Hash#to_proc, where hash[key] == hash.to_proc[key]. Now that would be a semantically meaningful Array#to_proc.
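
For reference, a small sketch of the Hash#to_proc behavior mentioned here, next to what the analogous "array as a callable object" reading could look like. Hash#to_proc and Method#to_proc are existing Ruby features; using names.method(:[]) to stand in for such an Array#to_proc is only an assumption made for illustration, and the sample data is made up.

```ruby
# Hash#to_proc exists: the proc looks its argument up in the hash.
titles = { 1 => 'Engineering', 2 => 'Accounting', 3 => 'Management' }
direct    = titles[2]
via_proc  = titles.to_proc.call(2)
looked_up = [1, 3].map(&titles)

# The analogous reading for arrays treats the array as a lookup
# function from index to element. Array#to_proc does not exist, so
# the bound method Array#[] stands in for it here.
names  = %w[John Jane Boris Alice]
picked = [3, 0].map(&names.method(:[]))
```

Under this reading the receiver is the collection being queried, which is the opposite of the proposal, where the array holds the keys and the block argument is the collection.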

Updated by zverok (Victor Shepelev) 6 months ago

Dan0042 (Daniel DeLorme)

If I understand correctly, you are saying that of these: ... you consider that the first one provides some kind of clarity of understanding that you don't find in the other three and that leads to better design?

That's an interesting question when put this way. What I believe about the first one is that it has the most interesting consequences: it allows writing shorter and more DRY code by introducing a new concept (which leads to thinking "where can it lead us with some similar concepts"), while shortcuts like the approved _1.foo or the rejected {.foo} are just "sugar".

I believe the bad thing about "just sugar" is not that it is magically "considered harmful" by some higher powers, but that it competes with structural, conceptual enhancements: both can be sold to "pragmatic" language users with "look, it's just more convenient", but "sugar" stops there, while a new concept leads to more new concepts. And here probably the main division happens: from what I can understand from this tracker, Twitter and Reddit over the last few years, everybody's sure "we don't need new concepts, Ruby's ideal as a language, it just can use some speedups and maybe a few more shortcuts". I am not in this camp, what can I say.

I also have an issue with the way you repurpose array literals to play the role of hash/array accessors. That's a hack. It relies on the coincidence that they both use square brackets.

That I can somewhat agree with :)
(Funny thing, my mind does jump on "the other side of the road" when I am thinking about it: "yes, this is a hack, but it is nice!")

Updated by Dan0042 (Daniel DeLorme) 6 months ago

zverok (Victor Shepelev) wrote:

What I believe about the first one is that it has the most interesting consequences: it allows writing shorter and more DRY code by introducing a new concept (which leads to thinking "where can it lead us with some similar concepts"), while shortcuts like the approved _1.foo or the rejected {.foo} are just "sugar".

I could sort of accept that if there was actually a new concept in play, but this particular proposal offers nothing more than replicating existing syntax. It's just indirection for the sake of indirection, with no modularity benefit. Again I go back to the example of Hash#to_proc, which does bring a useful new concept when you think of map(&lookuphash).

Updated by matz (Yukihiro Matsumoto) 6 months ago

  • Status changed from Open to Rejected

Rejected. Array#to_proc is too generic for queries. It only makes the code more cryptic.

Matz.
