Feature #18181
openIntroduce Enumerable#min_with_value, max_with_value, and minmax_with_value
Description
PR is https://github.com/ruby/ruby/pull/4874
I propose Enumerable#min_with_value
, max_with_value
and minmax_with_value
.
These methods work like this:
%w(abcde fg hijk).min_with_value { |e| e.size } # => ['fg', 2]
%w(abcde fg hijk).max_with_value { |e| e.size } # => ['abcde', 5]
%w(abcde fg hijk).minmax_with_value { |e| e.size } # => [['fg', 2], ['abcde', 5]]
Corresponding to #min(n)
, an integer argument can be passed to #min_with_value
or #max_with_value
.
%w(abcde fg hijk).min_with_value(2) { |e| e.size } # => [['fg', 2], ['hijk', 4]]
%w(abcde fg hijk).max_with_value(2) { |e| e.size } # => [['abcde', 5], ['hijk', 4]]
Motivation¶
When I use Enumerable#min_by
, I sometimes want to get not only the minimum element
but also the value from the given block.
(e.g.: There are many points. Find the nearest point and get distance to it.)
elem = enum.min_by { |e| foo(e) }
value = foo(elem)
This works, but I'd like to avoid writing foo() twice. (Consider a more complex case.)
This can be written without repeated foo() like belows, but it is slightly complicated and needs extra arrays.
value, elem = enum.map { |e| [foo(e), e] }.min_by(&:first)
If the size of enum is enormous, it is hard to use intermediate arrays.
Enumerable#min_with_value
solves this problem.
I think min_with_value
is the best name I could think of, but any suggestions for better names would be appreciated.
Benchmark¶
https://bugs.ruby-lang.org/issues/18181#note-2
Example¶
Solving a traveling salesman problem in nearest neighbor algorithm.
require 'set'
Point = Struct.new(:x, :y)
points = Set.new([Point.new(1, 1), Point.new(2, 4), Point.new(3, 3), Point.new(2, 2), Point.new(0, 1)])
total = 0
current = points.first
points.delete(current)
path = [current]
until points.empty?
current, distance = points.min_with_value do |point|
Math.hypot(current.x - point.x, current.y - point.y)
end
total += distance
points.delete(current)
path << current
end
p path
p total
Updated by sawa (Tsuyoshi Sawada) over 3 years ago
This is a frequently seen use case. I use code like this:
%w(abcde fg hi jkl mn).group_by(&:size).min # => [2, ["fg", "hi", "mn"]]
and I don't see a strong need to shorten it into a single method although I am not strongly against doing so. However, I have some concerns with your proposal:
-
Its use will be limited if it only picks a single element when the min/max element is not unique. I propose that it should rather hold an array of the corresponding elements.
-
I don't think the word "value" is appropriate. By this word, what comes to mind most naturally is the min/max value rather than the enumerated element. Perhaps a word like "element(s)" should be used.
-
I don't think the order of the elements you proposed is appropriate. With
each_with_object
, the enumerated entity is followed by the accumulating object, i.e., the word order matches the order of the elements within the subarray. I think you should follow this practice. -
The methods
min
/max
are used without a block,min_by
/max_by
are used with a block. And in this case, you are using a block.
Thus, I think your proposal should be modified into something like the following (which looks so similar to what can be done already as I have shown above):
%w(abcde fg hi jkl mn).min_by_with_elements(&:size) # => [2, ["fg", "hi", "mn"]]
Updated by kyanagi (Kouhei Yanagita) over 3 years ago
enum.group_by { ... }
holds all of the elements of enum,
so it consumes a lot of memory and is very slow if enum is enormous.
require 'benchmark'
Benchmark.bm(16) do |x|
enum = -1000000..1000000
x.report("group_by") { enum.group_by { |i| i.abs }.min }
x.report("map.min_by") { enum.map { |i| [i.abs, i] }.min_by(&:first) }
x.report("min_with_value") { enum.min_with_value { |i| i.abs } }
x.report("each") { enum.each { |i| i.abs } }
end
user system total real
group_by 1.238664 0.243779 1.482443 ( 1.483547)
map.min_by 0.721992 0.047738 0.769730 ( 0.770767)
min_with_value 0.236433 0.004105 0.240538 ( 0.240596)
each 0.150815 0.000000 0.150815 ( 0.151704)
#min
and #min_by
return single element even through there are multiple smallest elements.
There may be cases where it is useful to return multiple elements,
but single element will be enough for most cases.
Writing elements.first
everytime is a little bit of a pain.
Besides, size of the elements may become very large. (if almost all of the elements are mapped into the same value)
I think #min_with_value
should return the element as primary (i.e. the element is first, the block's value is second)
because #min
and #min_by
return the element.
Additionally, this is similar to the way #each_with_index
and #each_with_object
pass their block arguments.
enum.each_with_index do |elem, i| # <- the element is first, the other is second.
end
enum.each_with_object([]) do |elem, obj| # <- the element is first, the other is second.
end
Naming issue:
The behavior of this method is described as "The minimum element by the block value and its value".
Based on this, the name will be something like min_element_by_value_and_value
or min_element_by_value_with_value
if verbosely named.
Considering the method #min
which returns the element, the word "element" is thought to be able to omitted.
So I suggest the short name #min_with_value
. (I think #min_by_with_value
is also possible.)
The word "with" implies that "min(_element)" and "value" are different objects.
Updated by kyanagi (Kouhei Yanagita) over 3 years ago
- Description updated (diff)
Updated by kyanagi (Kouhei Yanagita) over 3 years ago
I added an example code using #min_with_value
.
Updated by Dan0042 (Daniel DeLorme) over 3 years ago
I like this proposal, but I think @sawa's idea has value also. It would be simple to accommodate both uses with a single syntax:
%w(abcde fg xx hijk).min_with_value(&:size) # => [2, 'fg', 'xx']
v,elem = %w(abcde fg xx hijk).min_with_value(&:size) #in the normal case where you only want the 'first' minimum
v,*elems = %w(abcde fg xx hijk).min_with_value(&:size) #when you want all of them
Updated by kyanagi (Kouhei Yanagita) over 3 years ago
- Description updated (diff)
Updated by kyanagi (Kouhei Yanagita) over 3 years ago
Description about #min_with_value(n)
added. It corresponds to #min(n)
.
Updated by Eregon (Benoit Daloze) about 3 years ago
:+1: I've needed this several times.
I think the semantics in the description are the best and clearest.
Updated by matz (Yukihiro Matsumoto) about 3 years ago
Just for confirmation, is it OK that min_with_value
returns only a single value?
I don't like the name _with_value
. It explains that the method returns an additional value with the minimum/maximum value, but no additional information on the value.
Matz.
Updated by Dan0042 (Daniel DeLorme) about 3 years ago
What about min_with
?
%w(abcde fg hijk).min_with(&:size) # => ['fg', 2]
# ^min ^size
Updated by cvss (Kirill Vechera) about 3 years ago
There's also a frequent similar problem with #find
when you need to find the first matched value instead of entry. But since it involves two semantically different code parts, it a bit more complex and cannot be implemented nicely with the same approach:
%w(abcde fg hijk).find_with_value { |e| (size = e.size).even? && size } # => ['fg', 2]
We can create some lazy mapping enumerator providing pairs [entry, value]
, which we unroll in the second chain just as much as we need and without repeating the computation of value:
module Enumerable
def with_map &block
map{|e| [e, block.call(e)]}
end
end
%w(abcde fg hijk).lazy.with_map(&:size).find{_2.even?} # => ['fg', 2]
Back to #min_with_value
, it could be written as:
%w(abcde fg hijk).lazy.with_map(&:size).min_by(&:last) # => ['fg', 2]
We can add some shorthand methods specific to this enumerator like these:
%w(abcde fg hijk).lazy.with_map(&:size).min_by_value # => ['fg', 2]
%w(abcde fg hijk).lazy.with_map(&:size).min # => ['fg', 2] - #min is overwritten as an alias to min_by_value
%w(abcde fg hijk).lazy.with_map(&:size).find_by_value(&:even?) # => ['fg', 2]
%w(abcde fg hijk).lazy.with_map(&:size).find(&:even?) # => ['fg', 2] - #find is overwritten as an alias to find_by_value
Updated by kyanagi (Kouhei Yanagita) about 3 years ago
matz (Yukihiro Matsumoto) wrote in #note-9:
Just for confirmation, is it OK that
min_with_value
returns only a single value?
I think that returning a single value will suit most cases.
It matches that min
also returns a single value.
But if it is considered that returning multiple values is better, I will not disagree with it.
Updated by ko1 (Koichi Sasada) about 3 years ago
Another idea (evaluate a given block with a result):
class Enumerator
def with_value(&b)
v = each(&b)
[v, b.call(v)]
end
end
pp %w(abcde fg hijk).min_by.with_value{|e| e.size}
#=> ["fg", 2]
It is general for ..._by
methods. If given block has side-effect or consumes huge time, it is not acceptable, though.
Updated by shan (Shannon Skipper) almost 3 years ago
For a single value, a "then" suffix might be nice.
%w(abcde fg hijk).min_by(&:size)
#=> "fg"
%w(abcde fg hijk).min_by_then(&:size)
#=> 2
It mirrors #min_by and #then with the same block.
%w(abcde fg hijk).min_by(&:size).then(&:size)
#=> 2
Updated by baweaver (Brandon Weaver) almost 3 years ago
I like the more generic suggestion from @ko1 (Koichi Sasada) on with_value
, I had a similar idea before reading it when @shan (Shannon Skipper) pinged me about this ticket.
The other potential is using the _map
suffix as is the case with filter_map
:
%w(abcde fg hijk).min_map { |e| e.size }
# => ['fg', 2]
%w(abcde fg hijk).max_map { |e| e.size }
# => ['abcde', 5]
%w(abcde fg hijk).minmax_map { |e| e.size }
# => [['fg', 2], ['abcde', 5]]
%w(abcde fg hijk).min_map(2) { |e| e.size }
# => [['fg', 2], ['hijk', 4]]
%w(abcde fg hijk).max_map(2) { |e| e.size }
# => [['abcde', 5], ['hijk', 4]]
This has precedence, though I do wonder how many functions we accumulate in Enumerable that are effectively the composition of two or more Enumerable methods.
More musing, and in a general sense, I'm reminded of Elixir extracting Enumerable-like methods into Enum
and allowing potential composition and the rejected prior syntax of Enumerable.:method
which ties into a fairly old post on my hopes for the also rejected pipeline operator that was proposed.
I may write a bit on that general idea, but I do see great value in the general concept of min_with_value
as I myself have needed it a few times in the past.
Updated by kyanagi (Kouhei Yanagita) almost 3 years ago
Another name proposal:
When I write array.min_with_value { foo }
, I mean "minimize foo with an element in array".
So array.minimizing { foo }
(and maximizing
) may be the candidate.
Updated by sawa (Tsuyoshi Sawada) almost 2 years ago
Related to #17097