Feature #3608
Enhancing Pathname#each_child to be lazy
Description
=begin
Right now it lists the entire directory, then yields
every element; that is, x.each_child(&b) means x.children.each(&b).
This is too slow for directories mounted over networked file systems etc.,
and there is currently no way to get lazy behaviour other than leaving the
convenient #each_child/#children API and moving to a lower level.
With this patch:
- #children is eager like before, no change here
- #each_child becomes lazy
- #each_child without a block returns a lazy enumerator,
so it can be used like dir.each_child.find(&:symlink?)
without losing laziness.
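The behaviour listed above could be sketched roughly as follows. This is not the attached patch; LazyChildren and its each_child are hypothetical stand-ins built on Dir.foreach, written only to show the intended semantics: entries are yielded one at a time, and calling without a block returns an enumerator.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical module, for illustration only (not the proposed patch).
module LazyChildren
  # Yields each child of +dir+ (a Pathname) as it is read from the
  # directory, instead of building the full array of children first.
  def self.each_child(dir, with_directory = true)
    # Without a block, return an enumerator; the directory is only
    # opened once iteration starts.
    return to_enum(:each_child, dir, with_directory) unless block_given?

    Dir.foreach(dir.to_s) do |name|
      next if name == "." || name == ".."
      yield(with_directory ? dir.join(name) : Pathname.new(name))
    end
  end
end

Dir.mktmpdir do |tmp|
  root = Pathname.new(tmp)
  File.write(root.join("plain.txt").to_s, "x")
  File.symlink(root.join("plain.txt").to_s, root.join("link").to_s)

  # find stops reading the directory as soon as a match is yielded.
  puts LazyChildren.each_child(root).find(&:symlink?).basename
end
```

Because find breaks out of the block early, the rest of the directory is never read, which is the point of the change for slow file systems.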
Patch is against trunk. pathname.rb was in lib/ not ext/pathname/lib/
before, but it works either way.
The part to return enumerator when called without a block wouldn't
work in 1.8. If backport is desired, that line would need to be thrown
away, and #children would need to build result array instead
of calling each_child(with_directory).to_a - this would be straightforward.
=end
Updated by akr (Akira Tanaka) over 14 years ago
2010/7/24 Tomasz Wegrzanowski redmine@ruby-lang.org:
Feature #3608: Enhancing Pathname#each_child to be lazy
http://redmine.ruby-lang.org/issues/show/3608
Right now it lists entire directory, then yields
every element, that is x.each_child(&b) means x.children.each(&b).
This is too slow for directories mounted over networked file systems etc.,
and there is currently no way to get lazy behaviour, other than leaving
convenient #each_child/#children API and moving to lower level.
A problem with the lazy behaviour is that it has a file descriptor open
when the block is called.
If the lazy each_child is used recursively, the limit on the number of
descriptors limits the recursion depth.
I'm not sure which problem is more important.
--
Tanaka Akira
Updated by taw (Tomasz Wegrzanowski) over 14 years ago
- File lazy_path_test.rb lazy_path_test.rb added
A problem with the lazy behaviour is that it has a file descriptor open
when the block is called.
If the lazy each_child is used recursively, the limit on the number of
descriptors limits the recursion depth.
I'm not sure which problem is more important.
This won't normally be a problem, as the directory
handle isn't opened on to_enum, only once
iteration actually begins.
Unless you put these enumerators on different fibres or
something like that, your maximum number of open
files will be limited by your file system depth
and also by stack depth, whichever is lower.
You'd need to have 100s of subdirectories
nested in one another like 1/2/3/4/5/.../100,
and have all these nested on the Ruby stack.
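To make the depth argument concrete, here is a minimal sketch (not the attached test). max_open_dirs is a hypothetical helper that recurses with Dir.foreach and returns the deepest nesting it reached. It assumes Dir.foreach keeps its directory open while the block runs, so the recursion depth stands in for the number of directory handles open at once.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical walker: one Dir.foreach call, and so one open directory,
# per level of recursion. Returns the deepest level reached.
def max_open_dirs(dir, depth = 1)
  deepest = depth
  Dir.foreach(dir) do |name|
    next if name == "." || name == ".."
    path = File.join(dir, name)
    # Only descend into real directories, never through symlinks.
    next unless File.directory?(path) && !File.symlink?(path)
    deepest = [deepest, max_open_dirs(path, depth + 1)].max
  end
  deepest
end

Dir.mktmpdir do |tmp|
  # Wide but shallow tree: 100 sibling directories, one subdirectory each.
  100.times { |i| FileUtils.mkdir_p(File.join(tmp, format("%02d", i), "a")) }
  puts max_open_dirs(tmp) # prints 3: root, NN, NN/a
end
```

Two hundred directories are visited here, but at most three are nested at any moment, so the count is set by depth and not by the total number of directories.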
Take a look at the attached test code
(also at http://pastebin.org/439336 ).
Even with ulimit -n as low as 16 and
a lot of directories it works perfectly
(tested on 00/a - 99/z and on the Ruby source tree).
Test 1 shows that calling map(&:each_child) won't open
directory handles just yet.
Test 2 shows that each_child works all right with recursion.
Test 3 just verifies that ulimit -n is applied.
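The claim behind Test 1 can be illustrated independently of the attached file. In this sketch, opened_only_on_iteration? is a hypothetical helper that builds an enumerator over a directory with Dir.to_enum(:foreach, ...) and reports whether the "no such directory" error shows up only once iteration starts. A missing directory is used because it makes the moment of opening observable.

```ruby
require "tmpdir"

# Hypothetical check: creating the enumerator must not touch the
# directory; the first attempt to iterate does.
def opened_only_on_iteration?(path)
  enum = Dir.to_enum(:foreach, path) # nothing is opened here
  begin
    enum.first                       # iteration begins, directory is opened
    false
  rescue Errno::ENOENT
    true
  end
end

Dir.mktmpdir do |tmp|
  puts opened_only_on_iteration?(File.join(tmp, "no-such-dir"))
end
```

If the directory were opened when the enumerator was created, the Errno::ENOENT would be raised outside the begin block and the helper would never return.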
Updated by akr (Akira Tanaka) over 13 years ago
- Project changed from Ruby to Ruby master
- Assignee set to akr (Akira Tanaka)
Updated by shyouhei (Shyouhei Urabe) over 12 years ago
- Status changed from Open to Assigned
Updated by mame (Yusuke Endoh) about 12 years ago
- Description updated (diff)
- Target version set to 2.6