Feature #16987

Enumerator::Lazy vs Array methods

Added by zverok (Victor Shepelev) 4 months ago. Updated 3 months ago.


Enumerations are designed to be greedy by default: each method call in a chain executes immediately. Sometimes that is impractical (e.g., given an array of 2 million strings: drop comments, split into fields, and find the first ten whose field 2 equals some value). So one needs to either do everything in one each block or use Enumerable#lazy. There are three problems with the latter:

  1. It is much less known,
  2. It is said to be almost always slower than non-lazy, and is therefore not recommended,
  3. It lacks some methods that are often necessary in processing large data chunks.
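The scenario from the description can be sketched roughly as follows. The data layout (pipe-separated fields, '#' comment lines), the 2,000-line size, and the variable names are assumptions made for illustration only; the counter exists just to show how little of the input a lazy chain touches.

```ruby
# Hypothetical data: every third line is a comment, the rest are "a|b|c" records.
lines = Array.new(2_000) do |i|
  i % 3 == 0 ? "# comment #{i}" : "id#{i}|#{i.even? ? 'match' : 'skip'}|payload"
end

processed = 0 # counts how many lines actually get split

first_ten = lines.lazy
  .reject { |line| line.start_with?("#") }            # drop comments
  .map    { |line| processed += 1; line.split("|") }  # split into fields
  .select { |fields| fields[1] == "match" }           # field 2 equals the value
  .first(10)                                          # stop after ten hits

puts first_ten.size
puts "split #{processed} of #{lines.size} lines"
```

A greedy chain with the same blocks would split every non-comment line before selecting anything; the lazy version stops as soon as the tenth match is found.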

I want to discuss (3) here. Enumerator::Lazy would be better off with methods such as #flatten, #product, and #compact, but it does not have them. They are all methods of Array, not Enumerable. In fact,

  1. They probably should belong to Enumerable (none of them requires anything besides #each to function),
  2. They are definitely useful for lazily processing large sequences.

Updated by sawa (Tsuyoshi Sawada) 4 months ago

  • Description updated (diff)

Updated by sawa (Tsuyoshi Sawada) 4 months ago

  • Description updated (diff)

Updated by midnight (Sarun R) 3 months ago

I use Lazy all the time. There is nothing to be done here about its popularity.
FWIW, people know about it but choose not to rely on it because they want to support old versions of Ruby.
Hence, it is not very popular in open-source settings.

Regardless of what should be implemented, for now, you can use


as #flatten, and

as #compact.
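As one possible pair of stand-ins (a sketch, not necessarily the exact expressions the commenter had in mind), Lazy#flat_map with an identity block behaves like a one-level flatten, and Lazy#reject with nil? behaves like compact. The sample data is made up.

```ruby
nested = [[1, 2], 3, [4, [5]]]
sparse = [1, nil, 2, nil, 3]

# One-level flatten: Lazy#flat_map with an identity block splices arrays in
# and passes non-array elements through unchanged.
flattened = nested.lazy.flat_map { |x| x }.to_a

# compact stand-in: drop nil elements only (false would be kept).
compacted = sparse.lazy.reject(&:nil?).to_a

p flattened
p compacted
```

Unlike Array#flatten, which recurses by default, this flattens a single level only, so deeper nesting survives.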

Only #product is the tricky one that requires multiple operations, but it is not used very often anyway.
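A two-operand product can still be sketched lazily by nesting a lazy map inside flat_map. The helper name lazy_product is hypothetical, and the sketch assumes the right-hand collection can be iterated repeatedly.

```ruby
# Hypothetical helper: pairs every element of `left` with every element of
# `right`, producing pairs on demand. `right` must be re-enumerable.
def lazy_product(left, right)
  left.lazy.flat_map do |x|
    right.lazy.map { |y| [x, y] }
  end
end

# Works even when the left side is infinite, because nothing is materialized.
pairs = lazy_product(1..Float::INFINITY, [:a, :b]).first(3)
p pairs
```

Generalizing this to more than two operands requires further nesting, which is part of what makes #product the awkward case.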

What I miss most is #scan.
It is basically a #reduce that yields at every iteration.
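To make the idea concrete, here is a minimal sketch of such a scan built on Enumerator.new. The name lazy_scan and its signature are assumptions for illustration, not an existing Ruby method.

```ruby
# Hypothetical helper: like reduce, but emits the accumulator after every step.
def lazy_scan(enum, initial, &block)
  Enumerator.new do |yielder|
    acc = initial
    enum.each do |item|
      acc = block.call(acc, item)
      yielder << acc
    end
  end.lazy
end

# Running sums over an infinite range; only four steps are ever computed.
running = lazy_scan(1..Float::INFINITY, 0) { |acc, x| acc + x }.first(4)
p running
```

The last emitted value is what #reduce would return for the same prefix of the input.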
