Feature #20664
Open
Add `before` and `until` options to Enumerator.produce
Description
Enumerator.produce provides a nice way to generate an infinite sequence, but it is a bit awkward to define how a sequence should end. It lacks a simple, intuitive syntax for producing typical finite sequences.
This proposal attempts to solve the problem by adding these two options to the method:
- `before`: when provided, it is used as a predicate to determine if an iteration should end before a generated value gets yielded.
- `until`: when provided, it is used as a predicate to determine if an iteration should end after a generated value gets yielded.
Any value that responds to `to_proc` and returns a `Proc` object is accepted in these options.
A typical use case for the `before` option is traversing a tree structure to iterate over the ancestors or following/preceding siblings of a node.
The `until` option can be used when there is a clear definition of the "last" value to yield.
enum = Enumerator.produce(File, before: :nil?, &:superclass)
enum.to_a #=> [File, IO, Object, BasicObject]
enum = Enumerator.produce(3, until: :zero?, &:pred)
enum.to_a #=> [3, 2, 1, 0]
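The intended behavior of the two options can be approximated in plain Ruby. The sketch below is not the proposed implementation: `bounded_produce` is a hypothetical helper built on `Enumerator.new`, and it assumes that `before` is checked before each value (including the initial one) is yielded and `until` is checked right after.

```ruby
# Hypothetical helper, not part of Ruby: emulates the proposed `before:` and
# `until:` options on top of Enumerator.new.
def bounded_produce(initial, before: nil, **opts, &step)
  # `until` is a reserved word, so it is read from the options hash instead
  # of being declared as a named keyword parameter.
  stop_before = before&.to_proc
  stop_after  = opts[:until]&.to_proc

  Enumerator.new do |yielder|
    value = initial
    loop do
      break if stop_before&.call(value) # end without yielding this value
      yielder << value
      break if stop_after&.call(value)  # end after yielding this value
      value = step.call(value)
    end
  end
end

ancestors = bounded_produce(File, before: :nil?, &:superclass).to_a
countdown = bounded_produce(3, until: :zero?, &:pred).to_a

p ancestors #=> [File, IO, Object, BasicObject]
p countdown #=> [3, 2, 1, 0]
```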
Updated by knu (Akinori MUSHA) 3 months ago
- Related to Feature #14781: Enumerator.generate added
Updated by knu (Akinori MUSHA) 3 months ago
- Related to Feature #20625: Object#chain_of added
Updated by zverok (Victor Shepelev) 3 months ago
I am not sure about this API.
I think in language core there aren’t many APIs that accept just a symbol of a necessary method (only `reduce(:+)` comes to mind, and I am still not sure why this form exists, because it seems to have been introduced at the same time as `Symbol#to_proc`, so `reduce(:+)` and `reduce(&:+)` have always co-existed).
Mostly callables are passed as a block (and therefore there can be only one); but some APIs accept another callable (any object with a `#call` method, like `Enumerator.new`).
So, what if the condition is not a method of the sequence?.. Should we accept callables, too? Or, what if the method’s user expects it to be a particular value (like `until: 0`), or a pattern (like `before: 0..1`)?
The alternative is
Enumerator.produce(File, &:superclass).take_until(&:nil?)
...which is more or less the same character-count-wise, more powerful (any block can be used), and more atomic.
The one problem is that we currently have neither `Enumerable#take_until` nor `Object#not_nil?` to write something like
# this wouldn’t work
Enumerator.produce(File, &:superclass).take_while(&:not_nil?)
# though one can use
Enumerator.produce(File, &:superclass).take_while(&:itself)
#=> [File, IO, Object, BasicObject]
...but in general, I suspect adding `Enumerable#take_until` to handle such cases (and `#take_while_after` while we are at it :)) might be a more powerful addition to the language, useful in many situations.
Updated by knu (Akinori MUSHA) 3 months ago
This proposal is based on the potential use cases I have experienced over the years. I've rarely seen a need for infinite sequences that can be defined with produce, and that is why I want to give produce() a feature-complete constructor.
Almost all sequences have had clear and simple end conditions. Traversing a tree structure for ancestor or sibling nodes would be the most typical use case, and predicates like `nil?` and `root?` are mostly enough. Type-based conditions and inclusion conditions are rarely seen, probably because sequences are likely to be homogeneous and there is rarely more than one terminal value or a range of them.
Updated by knu (Akinori MUSHA) 3 months ago
These options should take callables in this proposal. Procs and Methods certainly meet the condition: "Any value that responds to to_proc and returns a Proc object is accepted in these options".
The implementation does not bother to call `to_proc` on Procs, though.
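To make the "responds to `to_proc`" rule concrete, here is a small check using only existing core classes. It shows that Symbols, lambdas, and Method objects all convert to a `Proc`; the last entry illustrates how a pattern such as `0..1` could be passed as a Method object, assuming the implementation requires nothing beyond `to_proc`.

```ruby
# Candidate predicates of different kinds; each one responds to to_proc.
predicates = {
  symbol:  :zero?,                 # Symbol#to_proc
  lambda:  ->(n) { n.zero? },      # Proc#to_proc returns the proc itself
  method:  0.method(:==),          # Method#to_proc
  pattern: (0..1).method(:===),    # a Range pattern wrapped as a Method
}

# Convert each predicate and apply it to 0.
results = predicates.transform_values { |pred| pred.to_proc.call(0) }
kinds   = predicates.transform_values { |pred| pred.to_proc.class }

p results #=> {symbol: true, lambda: true, method: true, pattern: true}
p kinds.values.uniq #=> [Proc]
```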
Updated by matheusrich (Matheus Richard) 3 months ago
> The one problem is that we currently have neither Enumerable#take_until nor Object#not_nil? to write something like
After proposing `Object#chain_of`, I realized how missing one of these really makes things harder than they need to be.
With 3.4's `it`, the expression gets a bit more readable:
Enumerator.produce(File, &:superclass).take_while { !it.nil? }
IMO this pattern is common enough to deserve an optimization. `#not_nil?` would probably be harder to add (people will start talking about `present?` and how it is longer than `!<>.nil?`), so maybe proposing `#take_until` will be easier to get approval.
Updated by ufuk (Ufuk Kayserilioglu) 3 months ago · Edited
@matheusrich (Matheus Richard) In my opinion `take_until` might be an interesting method to add, but I think we are unnecessarily complicating the example with `it` and `nil?`.
The expression is simply:
Enumerator.produce(File, &:superclass).take_while(&:itself)
and it works perfectly fine and, IMO, is very readable for anyone who knows enough to reach for `Enumerator.produce` in the first place.
With respect to the original proposal in this ticket, I also find it a little awkward when Ruby methods take something callable other than blocks, but I understand the pragmatic use of the proposed `to_proc`-able keyword arguments that would satisfy the majority of cases where one would reach for `Enumerator.produce`.
Updated by zverok (Victor Shepelev) 2 months ago
> These options should take callables in this proposal. Procs and Methods certainly meet the condition: "Any value that responds to to_proc and returns a Proc object is accepted in these options".
Oh, yeah, sorry, missed this part, focused just on Symbol examples.
Interesting, I don’t think we have any API in core like this, accepting anything and implicitly converting it with `#to_proc`. Usually, when a second callable is needed, the agreement is that it should respond to `#call`, not to `#to_proc` (examples: `Proc#>>`, `Enumerator#new`).
I am not sure it is “how it should be” (there are not a lot of APIs like this anyway), but the approach with `#to_proc`-able things passed in many keyword arguments seems more Rails-ish. Maybe it is time to accept it.
Updated by Eregon (Benoit Daloze) 2 months ago
One issue with `take_until` is: does it include the element for which it yielded `true`?
In the description example `Enumerator.produce(3, until: :zero?, &:pred)` the result does include `0`.
But for `Enumerator.produce(parent, &:parent_directory).take_until(&:nil?)` the intention is to not include `nil` in the result.
Maybe we should have 2 variants of `take_until`, or a keyword argument to control whether the last element is included.
IMO `before:`/`until:` kwargs for `Enumerator.produce` feel too ad-hoc.
`take_until` is something I wished existed already.
But we need to address whether it includes the last element or not.
Updated by matheusrich (Matheus Richard) 2 months ago
IMO `take_until` shouldn't include the element. So the OP example should be:
Enumerator.produce(3, &:pred).take_until(&:negative?)
Updated by Eregon (Benoit Daloze) 2 months ago
I think that makes sense, as an opposite of `take_while`:
- `take_while` takes all elements until the block returns falsy, and does not include the element which yielded falsy.
- `take_until` takes all elements until the block returns truthy, and does not include the element which yielded truthy.
If clearly documented, that probably solves most of the confusion.
So +1 from me to add `take_until`.
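The exclusive semantics described above can be sketched as a standalone function. `take_until` here is a hypothetical helper written for illustration, not an existing `Enumerable` method; it stops at the first element for which the block is truthy and leaves that element out, mirroring how `take_while` drops the first falsy one.

```ruby
# Hypothetical stand-in for the suggested Enumerable#take_until (exclusive).
def take_until(enum)
  taken = []
  enum.each do |element|
    break if yield(element) # stop; the matching element is not included
    taken << element
  end
  taken
end

countdown = take_until(Enumerator.produce(3, &:pred), &:negative?)
ancestors = take_until(Enumerator.produce(File, &:superclass), &:nil?)

# The same results with the existing take_while and a negated condition.
countdown_while = Enumerator.produce(3, &:pred).take_while { |n| !n.negative? }

p countdown #=> [3, 2, 1, 0]
p ancestors #=> [File, IO, Object, BasicObject]
```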
I wonder if there is value in having variants that do include the element that stops, but at least so far in the linked issues there seems to be no such use-case.
Updated by zverok (Victor Shepelev) 2 months ago · Edited
About whether it does or doesn’t include the last element, there is a related inconclusive discussion: #18136
Basically, I believe that both `take_while` and `take_until` might have pairs of use cases:
# the last element is not necessary
sequence.take_until(&:bad?)
# the last element is necessary: think `.` in a sentence
sequence.take_until(&:terminator?)
# the last element is not necessary
sequence.take_while(&:suitable?)
# the last element is necessary: think the last page on pagination
# (which doesn’t match the condition “have more pages” but is a meaningful element itself)
sequence.take_while(&:has_next_item?)
(We have a `#take_while_after` in our `core_ext.rb`, and use it regularly.)
`Enumerable#slice_after` and `#slice_before`, for example, recognize that it might be either, but `take_while` does not.
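As an illustration of that point, the existing `slice_after` and `slice_before` can already express both variants when only the first slice is taken. This is a sketch using current core methods, not a proposal from the thread; it relies on `first` stopping the iteration once the first slice has been produced.

```ruby
# Inclusive: the first slice ends with the element that matched.
countdown = Enumerator.produce(3, &:pred)
inclusive = countdown.slice_after(&:zero?).first

# Exclusive: the first slice ends just before the element that matched.
classes   = Enumerator.produce(File, &:superclass)
exclusive = classes.slice_before(&:nil?).first

p inclusive #=> [3, 2, 1, 0]
p exclusive #=> [File, IO, Object, BasicObject]
```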
Updated by matheusrich (Matheus Richard) 2 months ago
The kwargs proposed here could be useful:
sequence.take_until(inclusive: true, &:terminator?)
Alternatively, we could always be inclusive and let people `pop` to remove the last element:
# Array#pop returns the removed element, so tap keeps the remaining array
sequence.take_until(&:terminator?).tap(&:pop)
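A keyword-controlled variant could look roughly like the sketch below. `take_until_opt` and its `inclusive:` flag are hypothetical names chosen for illustration; nothing here is an agreed API. The example also shows trimming an inclusive result afterwards, keeping in mind that `Array#pop` returns the removed element rather than the shortened array.

```ruby
# Hypothetical helper: take elements until the block is truthy, optionally
# keeping the element that matched.
def take_until_opt(enum, inclusive: false)
  taken = []
  enum.each do |element|
    if yield(element)
      taken << element if inclusive
      break
    end
    taken << element
  end
  taken
end

tokens = %w[Hello world . Next sentence .]

with_dot    = take_until_opt(tokens, inclusive: true) { |t| t == "." }
without_dot = take_until_opt(tokens) { |t| t == "." }

# Dropping the last element from an inclusive result without mutating it.
trimmed = with_dot[0...-1]

p with_dot    #=> ["Hello", "world", "."]
p without_dot #=> ["Hello", "world"]
```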