Bug #21780
Open
Change the default size of Enumerator.produce back to infinity
Added by zverok (Victor Shepelev) 2 days ago. Updated about 1 hour ago.
Description
In #21701 a new argument size: was introduced, and its default value is nil (unknown).
While I support the new argument, I'd argue that the default should be Float::INFINITY.
Reasoning: By design, Enumerator.produce is infinite (there is no internal condition to stop iteration), and the simplest, most straightforward usages of the method would produce definitely infinite iterators, which the user can then limit with take, take_while, or similar methods.
To produce the enumerator that will stop by itself requires explicit raising of StopIteration, which I expect to be a (slightly) advanced technique, and those who use it might be more inclined to provide additional arguments to clarify the semantics.
While Enumerator#size is rarely used now (other than in #to_set, which started the discussion), it might be in the future, and I believe it is better to stick with more user-friendly defaults.
Now:
# very trivial enumerator, but if you want it to have "proper" size, you need
# to not forget to use an elaborate argument and type additional 21 characters
Enumerator.produce(1, size: Float::INFINITY, &:succ)
# already non-trivial enumerator, which is hardly frequently used, but the
# current defaults correspond to its semantics:
Enumerator.produce(Date.today) {
  raise StopIteration if it.tuesday? && it.day.odd?
  it + 1
}
With my proposal:
# trivial, most widespread case:
Enumerator.produce(1, &:succ).size #=> Infinity
# non-trivial case, with the enumerator designer clarifying their
# intention that "we are sure it stops somewhere":
Enumerator.produce(Date.today, size: nil) {
  raise StopIteration if it.tuesday? && it.day.odd?
  it + 1
}
Updated by Eregon (Benoit Daloze) 2 days ago
· Edited
#1
[ruby-core:124194]
I disagree on this one, as written on https://bugs.ruby-lang.org/issues/21701#note-3
I think Enumerator#size should only be non-nil when it is known to be the exact size.
In this case it is not known if it is infinite, so returning Float::INFINITY for the size is "wrong".
One use case I know of for Enumerator#size is to do a progress bar while iterating the Enumerator.
That can only work reliably if the non-nil size is the exact size.
Returning Float::INFINITY when it is not would be misleading, though of course returning nil won't give the actual size, which might simply be not known.
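To make the progress-bar use case concrete, here is a minimal sketch of how a caller might branch on the three kinds of value Enumerator#size can return. The helper name progress_label is hypothetical and not part of Ruby; it only illustrates why a non-nil size is useful to such code when it is exact.

```ruby
# Hypothetical helper: formats a progress label from Enumerator#size.
# nil means "size unknown"; Float::INFINITY means "reported as endless".
def progress_label(enum, done)
  total = enum.size
  if total.nil?
    "#{done}/?"
  elsif total == Float::INFINITY
    "#{done}/infinite"
  elsif total.zero?
    "0/0"
  else
    "#{done}/#{total} (#{(100.0 * done / total).round}%)"
  end
end

puts progress_label([10, 20, 30, 40].each, 1)          # exact size known
puts progress_label(Enumerator.new { |y| y << 1 }, 1)  # size unknown (nil)
puts progress_label(loop, 1)                           # size reported as infinite
```

A percentage can only be shown in the last branch, which is the case where the reported size has to match what iteration actually yields.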
BTW, your examples use Enumerator.new but the text seems to be about Enumerator.produce.
I think either way it applies to both the same way though.
Updated by Eregon (Benoit Daloze) 2 days ago
#2
[ruby-core:124195]
Actually for Enumerator.new, it's trivial to not be infinite and does not even need StopIteration, e.g.:
Enumerator.new { |y| y << 1 }.count # => 1
So I guess you meant to use Enumerator.produce instead in your examples above.
Updated by zverok (Victor Shepelev) 2 days ago
· Edited
#3
[ruby-core:124196]
- Description updated (diff)
I think Enumerator#size should only be non-nil when it is known to be the exact size.
In this case it is not known if it is infinite, so returning Float::INFINITY for the size is "wrong".
I would argue that it is known to be infinite. That's how produce works: it loops infinitely unless explicitly stopped by an exception; there is no way to stop it other than an exceptional one (while this might seem to be a dumb pun, I actually think that we have a useful distinction here).
So I would argue that the default expectation of the user is to "not think about it and trust Ruby to do the sane thing", and the sane thing is "produce is infinite unless you raise that specific exception" (even break wouldn't work... which is kinda unpleasant, but a discussion for another time).
In a rare situation when they'd question the behavior, there is a clearly documented way to adjust it.
BTW, your examples use Enumerator.new but the text seems to be about Enumerator.produce.
Yes, thank you, fixed. The title said what I meant but the code was broken, sorry.
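As an illustration of the distinction drawn in this comment, the sketch below contrasts a produce-based enumerator that is limited from the outside with one that terminates itself by raising StopIteration. It uses explicit block parameters rather than it, so that it also runs on Ruby versions older than 3.4; the variable names are only for illustration.

```ruby
# Runs forever unless limited from the outside:
naturals = Enumerator.produce(1) { |n| n + 1 }
p naturals.take(3)                       #=> [1, 2, 3]

# Stops by itself: raising StopIteration inside the block ends iteration.
countdown = Enumerator.produce(3) { |n| n > 1 ? n - 1 : raise(StopIteration) }
p countdown.to_a                         #=> [3, 2, 1]

# The same finite sequence without an exception, limited from the outside:
p Enumerator.produce(3) { |n| n - 1 }.take_while { |n| n >= 1 }  #=> [3, 2, 1]
```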
Updated by zverok (Victor Shepelev) 2 days ago
#4
[ruby-core:124199]
There are, by the way, other effects of the current default that are, even if minor, still annoying:
Enumerator.produce(1, &:succ).lazy.take(6).size
# Ruby 3.4: => 6 -- which is correct and useful
# Ruby 4.0: => nil -- which is ... less useful
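The way the size of the source feeds into Enumerator::Lazy#take can be reproduced independently of the Ruby version with Enumerator.new, which has long accepted an explicit size argument. This is a sketch for illustration only; it does not call Enumerator.produce, so it says nothing about which default a given Ruby version uses.

```ruby
# One source declares an infinite size, the other leaves it unknown (nil).
declared_infinite = Enumerator.new(Float::INFINITY) { |y| n = 0; loop { y << (n += 1) } }
unknown_size      = Enumerator.new { |y| n = 0; loop { y << (n += 1) } }

p declared_infinite.lazy.take(6).size  #=> 6   (the limit is smaller than the source size)
p unknown_size.lazy.take(6).size       #=> nil (nothing to compare the limit against)
p unknown_size.lazy.take(6).to_a       #=> [1, 2, 3, 4, 5, 6] (iteration is unaffected)
```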
Updated by mame (Yusuke Endoh) about 13 hours ago
#5
- Related to Feature #21701: Enumerator.produce accepts an optional `size` keyword argument added
Updated by knu (Akinori MUSHA) about 12 hours ago
#6
[ruby-core:124223]
The argument that Enumerator.produce is infinite by nature is certainly valid. However, the change that made Enumerator#to_set refuse to operate when the size returns infinity introduced a compatibility issue: it breaks existing code that relies on being able to call #to_set on an Enumerator.produce enumerator that the programmer knows is finite.
Redefining Enumerator::Produce#to_set to ignore the size is one way, but that would be awkward and incorrect in the long term. This decision was made after balancing backward compatibility against what the default size of produce() should be.
Updated by zverok (Victor Shepelev) about 11 hours ago
#7
[ruby-core:124224]
However, the change that made Enumerator#to_set refuse to operate when the size returns infinity introduced a compatibility issue
TBH, I don't see the compatibility argument applied with any consistency here.
Let's imagine several cases:
- Somebody relies on constructing elaborate Enumerator.produce-based enumerators that throw StopIteration to terminate (instead of using simpler techniques), and then applies #to_set to them. In this feature, we are keeping compatibility for them.
- Somebody uses Enumerator.produce alongside other types of enumerators. In some branch of their code, they do raise "Can't do this operation" if enum.size == Float::INFINITY. The compatibility is broken for them.
- Somebody relies on Enumerator.produce { ... }.take(5) to have non-nil size, throwing it around as a duck-typed array. The compatibility is broken for them.
- (Just to expand the scope of possible compatibility studies) Somebody might've suddenly had code like this, and it is now also broken:
Enumerator.produce(size: 5) { it.merge(size: it[:size] + 1) }.take(8) #=> [{size: 5}, {size: 6}, {size: 7}, {size: 8}, {size: 9}, {size: 10}, {size: 11}, {size: 12}]
Intuitively, I would say that (2) is the most basic case that shouldn't be broken; (3) is (weak) evidence for the same; while (1) and (4) are both "collateral damage" that should be accepted (because if we treat compatibility with any real rigor, no changes should be made at all, "any change breaks somebody's usecase").
Is there any study that I am not aware of that says that (1) is the widespread case and breaking it will outrage a huge part of the community, while breaking 2-3 (as well as the general semantics of the method) is negligible?
Or maybe there is some evidence/discussion that it is authors of elaborate enumerators who wouldn't understand a very small (and well-explained semantically) change to fix the incompatibility, while those who would expect this enumerator to be infinite should just swallow it and add size: Float::INFINITY maybe in a dozen places in their code?
What am I missing here?
Updated by knu (Akinori MUSHA) about 6 hours ago
· Edited
#8
[ruby-core:124236]
zverok (Victor Shepelev) wrote in #note-7:
TBH, I don't see the compatibility argument applied with any consistency here.
Let's imagine several cases:
- Somebody relies on constructing elaborate Enumerator.produce-based enumerators that throw StopIteration to terminate (instead of using simpler techniques), and then applies #to_set to them. In this feature, we are keeping compatibility for them.
This is where you are mistaken.
- Ruby 3.4.7
% ruby -ve 'e=Enumerator.produce(1) {it>=3 ? raise(StopIteration) : it+1}; p e.to_set'
ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [arm64-darwin25]
#<Set: {1, 2, 3}>
- Ruby master
% ruby -ve 'e=Enumerator.produce(1) {it>=3 ? raise(StopIteration) : it+1}; p e.to_set'
ruby 4.0.0dev (2025-12-16T10:52:45Z master 2b1a9afbfb) +PRISM [arm64-darwin25]
Set[1, 2, 3]
(compatible; ignoring the string representation difference)
- Ruby master with 79a6ec74831cc47d022b86dfabe3c774eaaf91ca reverted
% ruby -ve 'e=Enumerator.produce(1) {it>=3 ? raise(StopIteration) : it+1}; p e.to_set'
ruby 4.0.0dev (2025-12-16T10:52:45Z master 2b1a9afbfb) +PRISM [arm64-darwin25]
-e:1:in 'Enumerator#to_set': cannot convert an infinite enumerator to a set (ArgumentError)
	from -e:1:in '<main>'
(compatibility broken)
Updated by knu (Akinori MUSHA) about 5 hours ago
#9
[ruby-core:124237]
I may have misread your comment. Let me review again.
Updated by zverok (Victor Shepelev) about 5 hours ago
#10
[ruby-core:124238]
This is where you are mistaken.
No, that's exactly what I've meant:
# case 1:
e = Enumerator.produce(1, size: Float::INFINITY) {it>=3 ? raise(StopIteration) : it+1}
p e.to_set
# ruby 3.4: #<Set: {1, 2, 3}>
# master: Set[1, 2, 3]
# master with my proposed fix (size: Infinity by default):
# cannot convert an infinite enumerator to a set (ArgumentError) -- compatibility broken
# fix for the code authors:
e = Enumerator.produce(1, size: nil) {it>=3 ? raise(StopIteration) : it+1} # -- clear change, easy to explain
# Another possible fix, "come to think about it, I don't need StopIteration!"
e = Enumerator.produce(1) { it + 1 }.take_while { it <= 3 }.to_set #=> Set[1, 2, 3]
# case 2:
e = Enumerator.produce(1, &:succ)
if e.size == Float::INFINITY
  puts "Early stop processing!" # in reality, probably raise/early return
else
  puts "Continue processing"
end
# ruby 3.4: "Early stop processing"
# master: "Continue processing" -- compatibility broken
# fix for the code authors:
Enumerator.produce(1, size: Float::INFINITY, &:succ) # -- ugly change to trivial code
# master with my proposed fix (size: Infinity by default): "Early stop processing" -- compatibility OK
# case 3:
e = Enumerator.produce(1, &:succ).lazy.take(6)
p e.size
# ruby 3.4: 6
# master: nil -- compatibility broken
# master with my proposed fix: 6 -- compatibility OK
Why do we consider that case (1) of those is the one where we preserve compatibility, while two other should be broken, and not vice versa?
Updated by Eregon (Benoit Daloze) about 5 hours ago
· Edited
#11
[ruby-core:124241]
Those are good points about compatibility; I agree that losing the size on .take is not good.
IMO trying to detect an "infinite loop" (in https://bugs.ruby-lang.org/issues/21654) is kinda pointless: it's the halting problem, and for the vast majority of cases we cannot know.
Enumerators which have a .size => Float::INFINITY may or may not be infinite, they might use break (and yet not adjust .size), throw an exception (can't be fully predicted and always possible), etc.
So I think a good fix would be to keep Enumerator.produce.size as Float::INFINITY, and stop assuming that enum.size.infinite? means infinite loop to iterate since it's not guaranteed (similar thinking as in this comment).
Updated by Eregon (Benoit Daloze) about 5 hours ago
#12
- Related to Bug #21654: Set#new calls extra methods compared to previous versions added
Updated by knu (Akinori MUSHA) about 4 hours ago
· Edited
#13
[ruby-core:124243]
zverok (Victor Shepelev) wrote in #note-7:
- Somebody uses Enumerator.produce alongside other types of enumerators. In some branch of their code, they do raise "Can't do this operation" if enum.size == Float::INFINITY. The compatibility is broken for them.
I consider that kind of usage as a safeguard that usually works in development rather than something that should be relied upon in production code. However, the same goes for the change in Enumerator#to_set, so we might want to consider reverting it in the first place.
- Somebody relies on Enumerator.produce { ... }.take(5) to have non-nil size, throwing it around as a duck-typed array. The compatibility is broken for them.
Enumerator.produce { ... }.take(5) returns an array. Did you mean Enumerator.produce { ... }.lazy.take(5)? If so, then yes, the compatibility is broken for them.
- (Just to expand the scope of possible compatibility studies) Somebody might've suddenly had code like this, and it is now also broken:
Enumerator.produce(size: 5) { it.merge(size: it[:size] + 1) }.take(8) #=> [{size: 5}, {size: 6}, {size: 7}, {size: 8}, {size: 9}, {size: 10}, {size: 11}, {size: 12}]
This should have never been allowed. It was my mistake I did not prohibit it.
Intuitively, I would say that (2) is the most basic case that shouldn't be broken; (3) is a (weak) evidence to the same; while both (1) and (4) are both a "collateral damage" that should be accepted (because if we treat compatibility with any real rigor, no changes should be made at all, "any change breaks somebody's usecase").
I think (3) is more important than (2), and I'm not absolutely sure about how (1) and (3) compare in importance. If the concern about (1) could be eliminated by reverting Enumerator#to_set, we would be able to save both.
Is there any study that I am not aware of that says that (1) is the widespread case and breaking it will outrage a huge part of the community, while breaking 2-3 (as well as the general semantics of the method) is negligible?
Or maybe there is some evidence/discussion that it is authors of elaborate enumerators who wouldn't understand a very small (and well-explained semantically) change to fix the incompatibility, while those who would expect this enumerator to be infinite should just swallow it and add size: Float::INFINITY maybe in a dozen places in their code?
I don't have any data about how widespread each of these use cases is, or even how Enumerator.produce is used in production code. I just know Enumerator.produce is still immature and would love to hear from more users about how they use it.
Updated by knu (Akinori MUSHA) about 4 hours ago
#14
[ruby-core:124244]
I'm leaning toward doing these:
- Removing the Enumerator#to_set override that refuses to work against an infinite enumerator as a safeguard
- Reverting the default size of Enumerator.produce from nil to infinity
This way we can guarantee 100% backward compatibility at the expense of some safety against infinite enumerators with #to_set (and potentially #to_a in the future).
What do you think?
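For code that would still like a guard of this kind if the built-in check goes away, an opt-in version is easy to sketch at the call site. The helper name to_set_checked is hypothetical and not part of Ruby's API; as discussed above, a reported size of infinity is only a hint, so this is a development-time safeguard rather than a guarantee.

```ruby
require "set"

# Hypothetical opt-in guard; not part of Ruby's API.
# Refuses to build a set from an enumerator that reports an infinite size.
def to_set_checked(enum)
  if enum.size == Float::INFINITY
    raise ArgumentError, "enumerator reports an infinite size"
  end
  enum.to_set
end

p to_set_checked([1, 2, 2].each)   # finite size: converted as usual

begin
  to_set_checked(loop)             # loop without a block reports an infinite size
rescue ArgumentError => e
  puts e.message
end
```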
Updated by zverok (Victor Shepelev) about 4 hours ago
#15
[ruby-core:124245]
knu (Akinori MUSHA) wrote in #note-14:
I'm leaning toward doing these:
- Removing the Enumerator#to_set override that refuses to work against an infinite enumerator as a safeguard
- Reverting the default size of Enumerator.produce from nil to infinity
This way we can guarantee 100% backward compatibility at the expense of some safety against infinite enumerators with #to_set (and potentially #to_a in the future).
What do you think?
IMO, this seems like a great compromise!
Updated by mame (Yusuke Endoh) about 3 hours ago
#16
[ruby-core:124248]
I couldn't find any cases where Enumerator#size returns Float::INFINITY for a finite-length Enumerator, except Enumerator.produce and when explicitly creating a fake-sized enumerator with Enumerator.new(Float::INFINITY). Am I missing something?
raise StopIteration is indeed tricky, but even so, if it could be finite, I think it is correct for Enumerator.produce's size to return nil (the size is unknown).
What problem do you have if the size is nil?
Updated by knu (Akinori MUSHA) about 3 hours ago
#17
[ruby-core:124250]
mame (Yusuke Endoh) wrote in #note-16:
What problem do you have if the size is nil?
Among others, Enumerator.produce(1) { it+1 }.lazy.take(5).size now returns nil, where it previously returned 5.
https://github.com/ruby/ruby/blob/6b35f074bd83794007d4c7b773a289bddef0dbdf/enumerator.c#L2433
Updated by mame (Yusuke Endoh) about 3 hours ago
#18
[ruby-core:124251]
With Ruby 3.4:
Enumerator.produce(1) { raise StopIteration }.lazy.take(5).size #=> 5
Enumerator.produce(1) { raise StopIteration }.lazy.take(5).to_a.size #=> 1
I believe it is fair to call this behavior a bug.
Updated by knu (Akinori MUSHA) about 3 hours ago
#19
[ruby-core:124252]
The point of this discussion is that fixing the "bug" broke compatibility, and we are comparing the impacts. That "bug" can or should be properly fixed by changing the code to Enumerator.produce(1, size: nil) { raise StopIteration }.lazy.take(5) after 4.0 is out.
Updated by mame (Yusuke Endoh) about 2 hours ago
#20
[ruby-core:124253]
Yes, all bug fixes could be incompatibilities. What I am interested in is whether this incompatibility is actually causing a real problem, and if so, what kind of applications or libraries were affected.
Updated by zverok (Victor Shepelev) about 2 hours ago
#21
[ruby-core:124254]
@mame (Yusuke Endoh) My thinking (already outlined above) goes this way:
- while Enumerator#size, I believe, is not used extensively, it might be in the future (this very change of the Enumerator.produce call-sequence brings some attention to it)
- Enumerator::Lazy#take respecting the infinity size is a (weak) argument that Enumerator#size matters at least sometimes, and it is good to have it aligned with the user's intuitions
- Enumerator.produce being an infinite enumerator corresponds to its design. Yes, it can be broken out of by an exception, but its more general behavior is "loop infinitely". So, basically:
  - if the developer relies on the default behavior (which produces an infinite sequence), they have the default size (infinity)
  - if they consciously adjust behavior, by throwing exceptions, it might be OK for them (if they care at all), to adjust size to nil/"unknown beforehand" or some known value.
The only reason the default for size: was chosen to be nil is that Enumerator.produce {...}.to_set was "broken". And, honestly, I don't think it is a good reason to muddy the semantics. @knu's proposal to just stop checking #size in Enumerator#to_set seems to resolve this.
Are there clear advantages to keep it nil now, if the enumerator is infinite by design?
TL;DR: I don't think many devs care, but for those who do, Infinity is more reasonable for this enumerator.
Updated by mame (Yusuke Endoh) about 2 hours ago
#22
[ruby-core:124255]
I understand zverok's feeling. In fact, I thought Enumerator.produce always returned an infinite Enumerator. I'm surprised it can be stopped by StopIteration. However, since that functionality actually exists, I think #size has no choice but to return nil.
while Enumerator#size, I believe, is not used extensively
I agree. And, that's precisely why I believe making #size return nil won't cause any real-world incompatibility issues.
Updated by zverok (Victor Shepelev) about 1 hour ago
#23
[ruby-core:124256]
However, since that functionality actually exists, I think #size has no choice but to return nil.
Respectfully, I disagree. I think it is much easier and more useful to explain it along the lines of...
It is infinite by implementation, but you can break it with exception (which is, well, for exceptional situations); if you care about size reporting, here is how you can adjust it (the new size: parameter introduced).
Yes, if you didn't pass it AND broke the enumerator, the #size "lies", but as the parameter exists, it is no different from the "lie" in this situation:
e = Enumerator.new(5) { |y| 2.times { y << it } }
e.size #=> 5
e.to_a.size #=> 2
If the user cares, they will appreciate "you have control, the defaults are sane for the default usage". I agree there might not be many who care, but the feature exists, and it exposes some semantics, and it might become more used in the future.
The discussion is mostly theoretical, I agree. But I believe that if Ruby 4.0 draws some attention to Enumerator.produce-enumerator sizes (by introducing the new parameter), it is better to use the most useful defaults. And I do believe that Infinity is the most useful.