Feature #19061

Proposal: make a concept of "consuming enumerator" explicit

Added by zverok (Victor Shepelev) about 2 years ago. Updated almost 2 years ago.

Status: Open
Assignee: -
Target version: -
[ruby-core:110312]

Description

The problem

Let's imagine this synthetic data:

lines = [
  "--EMAIL--",
  "From: zverok.offline@gmail.com",
  "To; bugs@ruby-lang.org",
  "Subject: Consuming Enumerators",
  "",
  "Here, I am presenting the following proposal.",
  "Let's talk about consuming enumerators..."
]

The logic of parsing it is more or less clear:

  • skip the first line
  • take lines until an empty one is met, to read the header
  • take the rest of the lines to read the body

It can be easily translated into Ruby code, almost literally:

def parse(enumerator)
  puts "Testing: #{enumerator.inspect}"
  enumerator.next                         # skip the first line
  p enumerator.take_while { !_1.empty? }  # header: lines up to the first empty one
  p enumerator.to_a                       # body: the rest of the lines
end

Now, let's try this code with two different enumerators on those lines:

require 'stringio'

enumerator1 = lines.each
enumerator2 = StringIO.new(lines.join("\n")).each_line(chomp: true)

puts "Array#each"
parse(enumerator1)

puts
puts "StringIO#each_line"
parse(enumerator2)

Output (as you probably already guessed):

Array#each
Testing: #<Enumerator: [...]:each>
["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"]
["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators", "", "Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."]

StringIO#each_line
Testing: #<Enumerator: #<StringIO:0x00005581018c50a0>:each_line(chomp: true)>
["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"]
["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."]

Only the second enumerator behaves the way we wanted it to.
Things to notice here:

  1. Both enumerators are of the same class, "just an enumerator," but they behave differently: one consumes data on each call to an iteration method, the other does not. There is no programmatic way to tell whether a given enumerator instance is consuming.
  2. There is no easy way to make a non-consuming enumerator behave in a consuming way, which would make it possible to process a sequence as "skip this, take that, take the rest."
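
To illustrate the first point, here is a minimal sketch on a smaller, made-up array (the names array_enum and io_enum are only for illustration). It shows that the class alone does not distinguish the two, and that the only way to find out today is a behavioral probe:

```ruby
require 'stringio'

lines = ["a", "b", "c"]
array_enum = lines.each
io_enum = StringIO.new(lines.join("\n")).each_line(chomp: true)

# Same class, so the class alone cannot tell them apart.
p array_enum.class  # Enumerator
p io_enum.class     # Enumerator

# Behavioral probe: take one element, then ask for "everything".
array_first = array_enum.first(1)
array_rest  = array_enum.to_a
io_first    = io_enum.first(1)
io_rest     = io_enum.to_a

p array_rest  # ["a", "b", "c"]: iteration restarted from the beginning
p io_rest     # ["b", "c"]: the first line is already consumed
```

Note that such a probe is destructive for a consuming enumerator (the probed element is gone), which is one reason an explicit flag would be preferable.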

Concrete proposal

  1. Introduce an Enumerator#consuming? method that allows telling one from the other (and make core enumerators like #each_line properly report that they are consuming).
  2. Introduce a consuming: true parameter for Enumerator.new, so that user code can easily set the flag.
  3. Introduce an Enumerator#consuming method to produce a consuming enumerator from a non-consuming one:
# reference implementation is trivial:
class Enumerator
  def consuming
    source = self
    Enumerator.new { |y| loop { y << source.next } }
  end
end

enumerator3 = lines.each.consuming
parse(enumerator3)

Output (the "Testing:" line is omitted):

["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"]
["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."]
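
For items 1 and 3 together, a user-land emulation might look like the sketch below. Everything in it is hypothetical: neither Enumerator#consuming? nor Enumerator#consuming exists in core Ruby, the ConsumingFlag module is an invented helper, and the "non-consuming by default" rule is an assumption. In particular, this sketch cannot make core enumerators such as #each_line report true; that part would need support in core.

```ruby
# Hypothetical user-land emulation; none of these methods exist in core Ruby.
module ConsumingFlag
  def consuming?
    true
  end
end

class Enumerator
  # Assumed default: an enumerator is non-consuming unless marked otherwise.
  def consuming?
    false
  end

  # Same wrapper as the reference implementation, plus the flag.
  def consuming
    source = self
    Enumerator.new { |y| loop { y << source.next } }.extend(ConsumingFlag)
  end
end

plain = %w[skip a b].each
wrapped = plain.consuming

p plain.consuming?    # false
p wrapped.consuming?  # true

skipped = wrapped.next  # consumes "skip"
rest = wrapped.to_a     # only what is left
p rest                  # ["a", "b"]
```

The flag is attached per instance with extend, so the wrapper stays a plain Enumerator and the class hierarchy is unchanged.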