Feature #19061
Updated by nobu (Nobuyoshi Nakada) about 2 years ago
**The problem** Let's imagine this synthetic data: ```ruby lines = [ "--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators", "", "Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..." ] ``` The logic of parsing it is more or less clear: * skip the first line * take lines until meet empty, to read the header * take the rest of the lines to read the body It can be easily translated into Ruby code, almost literally: ```ruby def parse(enumerator) puts "Testing: #{enumerator.inspect}" enumerator.next p enumerator.take_while { !_1.empty? } p enumerator.to_a end ``` Now, let's try this code with two different enumerators on those lines: ```ruby require 'stringio' enumerator1 = lines.each enumerator2 = StringIO.new(lines.join("\n")).each_line(chomp: true) puts "Array#each" parse(enumerator1) test(enumerator1) puts puts "StringIO#each_line" parse(enumerator2) test(enumerator2) ``` Output (as you probably already guessed): ``` Array#each Testing: #<Enumerator: [...]:each> ["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"] ["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators", "", "Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."] StringIO#each_line Testing: #<Enumerator: #<StringIO:0x00005581018c50a0>:each_line(chomp: true)> ["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"] ["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."] ``` Only the second enumerator behaves the way we wanted it to. Things to notice here: 1. Both enumerators are of the same class, "just enumerator," but they behave differently: one of them is **consuming** data on each iteration method, the other does not; but there is no programmatic way to tell whether some enumerator instance is consuming 2. There is no easy way to **make a non-consuming enumerator behave in a consuming way**, to open a possibility of a sequence of processing "skip this, take that, take the rest" **Concrete proposal** 1. Introduce an `Enumerator#consuming?` method that will allow telling one of the other (and make core enumerators like `#each_line` properly report they are consuming). 2. Introduce `consuming: true` parameter for `Enumerator.new` so it would be easy for user's code to specify the flag 3. Introduce `Enumerator#consuming` method to produce a consuming enumerator from a non-consuming one: ```ruby # reference implementation is trivial: class Enumerator def consuming source = self Enumerator.new { |y| loop { y << source.next } } end end enumerator3 = lines.each.consuming parse(enumerator3) ``` Output: ``` ["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"] ["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."] ```