Feature #19061

Updated by nobu (Nobuyoshi Nakada) about 2 years ago

**The problem** 

 Let's imagine this synthetic data: 
 ```ruby 
 lines = [ 
   "--EMAIL--", 
   "From: zverok.offline@gmail.com", 
   "To; bugs@ruby-lang.org", 
   "Subject: Consuming Enumerators", 
   "", 
   "Here, I am presenting the following proposal.", 
   "Let's talk about consuming enumerators..." 
 ] 
 ``` 
 The logic of parsing it is more or less clear: 
 * skip the first line 
 * take lines until an empty one is met, to read the headers 
 * take the rest of the lines to read the body 

 It can be easily translated into Ruby code, almost literally: 
 ```ruby 
 def parse(enumerator) 
   puts "Testing: #{enumerator.inspect}" 
   enumerator.next 
   p enumerator.take_while { !_1.empty? } 
   p enumerator.to_a 
 end 
 ``` 

 Now, let's try this code with two different enumerators on those lines: 
 ```ruby 
 require 'stringio' 

 enumerator1 = lines.each 
 enumerator2 = StringIO.new(lines.join("\n")).each_line(chomp: true) 

 puts "Array#each" 
 parse(enumerator1) 

 puts 
 puts "StringIO#each_line" 
 parse(enumerator2) 
 ``` 
 Output (as you probably already guessed): 
 ``` 
 Array#each 
 Testing: #<Enumerator: [...]:each> 
 ["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"] 
 ["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators", "", "Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."] 

 StringIO#each_line 
 Testing: #<Enumerator: #<StringIO:0x00005581018c50a0>:each_line(chomp: true)> 
 ["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"] 
 ["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."] 
 ``` 

 Only the second enumerator behaves the way we wanted it to. 
 Things to notice here: 
 1. Both enumerators are of the same class, "just an enumerator," but they behave differently: one of them **consumes** data with every iteration method called on it, the other does not. There is no programmatic way to tell whether a given enumerator instance is consuming. 
 2. There is no easy way to **make a non-consuming enumerator behave in a consuming way**, which would allow sequential processing such as "skip this, take that, take the rest." 
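
 The difference described in item 1 can be reproduced without `parse`. The following is a minimal sketch that uses only core methods; the sample data and variable names are made up for illustration. `Enumerator#next` moves an external cursor, while `first`, `take_while`, `to_a`, and other `Enumerable` methods call the underlying `each` again. For an array that restarts from the first element; for a `StringIO` it resumes from the shared read position.

```ruby
require 'stringio'

data = ["a", "b", "c"]

array_enum = data.each
array_enum.next                       # advances only the external cursor
array_result = array_enum.first(2)    # internal iteration restarts from the beginning
p array_result                        # => ["a", "b"]

io_enum = StringIO.new(data.join("\n")).each_line(chomp: true)
io_enum.next                          # reads one line and moves the StringIO position
io_result = io_enum.first(2)          # iteration resumes after the line already read
p io_result                           # => ["b", "c"]

# Both objects report the same class, so the class alone cannot tell them apart.
same_class = array_enum.class == io_enum.class
p same_class                          # => true
```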

 **Concrete proposal** 

 1. Introduce an `Enumerator#consuming?` method that allows telling one from the other (and make core enumerators like `#each_line` properly report that they are consuming). 
 2. Introduce a `consuming: true` parameter for `Enumerator.new`, so that user code can easily set the flag. 
 3. Introduce an `Enumerator#consuming` method to produce a consuming enumerator from a non-consuming one: 
 ```ruby 
 # reference implementation is trivial: 
 class Enumerator 
   def consuming 
     source = self 
     Enumerator.new { |y| loop { y << source.next } } 
   end 
 end 

 enumerator3 = lines.each.consuming 
 parse(enumerator3) 
 ``` 
 Output (the `Testing:` line printed by `parse` is omitted): 
 ``` 
 ["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"] 
 ["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."] 
 ```
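
 Items 1 and 2 of the proposal can be approximated in user code today. The sketch below is only an illustration under stated assumptions: `ConsumingFlag`, `make_consuming`, and `consuming_enumerator?` are hypothetical names that do not exist in Ruby, and the sample data is invented. It tags the wrapper from the reference implementation with a marker module instead of patching `Enumerator`.

```ruby
# Hypothetical user-land emulation of the proposal; not part of core Ruby.
module ConsumingFlag
end

# Wraps any enumerator so that every iteration method draws from one shared cursor.
def make_consuming(source)
  Enumerator.new { |y| loop { y << source.next } }.extend(ConsumingFlag)
end

# Enumerators without the marker are treated as non-consuming.
def consuming_enumerator?(enumerator)
  enumerator.is_a?(ConsumingFlag)
end

sample  = %w[skip head1 head2 stop body1 body2]
plain   = sample.each
wrapped = make_consuming(sample.each)

wrapped.next                                           # skip the first element
headers = wrapped.take_while { |line| line != "stop" } # the "stop" element is consumed too
body    = wrapped.to_a                                 # whatever is left

p consuming_enumerator?(plain)    # => false
p consuming_enumerator?(wrapped)  # => true
p headers                         # => ["head1", "head2"]
p body                            # => ["body1", "body2"]
```

 A marker like this cannot cover core enumerators such as the one returned by `StringIO#each_line`: they would still be reported as non-consuming, which is why the proposal asks for support in core.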
