Feature #6047
closed
read_all: Grow buffer exponentially in generic case
Added by MartinBosslet (Martin Bosslet) about 12 years ago.
Updated over 1 year ago.
Description
In the general case, read_all grows its buffer linearly by just the amount that is currently read from the underlying source. This results in a linear number of reallocs. It might turn out beneficial if the buffer were grown exponentially by multiplying with a constant factor (e.g. 1.5 or 2), thus resulting in only a logarithmic number of reallocs.
I will provide a patch and benchmarks, but I'm already opening this issue so I won't forget.
See also https://bugs.ruby-lang.org/issues/5353 for more details.
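To make the realloc counts concrete, here is a minimal Ruby sketch that only simulates the two growth policies. It is not the io.c code; the method names, the 1024-byte chunk size, and the default factor of 2 are assumptions chosen for illustration.

```ruby
# Simulation only: counts how often a buffer would have to be reallocated
# while reading `total` bytes in `chunk`-byte pieces.
# All names and numbers are illustrative assumptions.

# Linear policy: grow by exactly one chunk whenever more room is needed.
def linear_reallocs(total, chunk)
  capacity = 0
  filled = 0
  reallocs = 0
  while filled < total
    if capacity - filled < chunk
      capacity += chunk
      reallocs += 1
    end
    filled += [chunk, total - filled].min
  end
  reallocs
end

# Exponential policy: multiply the capacity by `factor` when room runs out.
def exponential_reallocs(total, chunk, factor = 2)
  capacity = chunk
  filled = 0
  reallocs = 1 # counts the initial allocation
  while filled < total
    while capacity - filled < chunk
      capacity = (capacity * factor).ceil
      reallocs += 1
    end
    filled += [chunk, total - filled].min
  end
  reallocs
end

one_mib = 1024 * 1024
puts "linear:      #{linear_reallocs(one_mib, 1024)}"
puts "exponential: #{exponential_reallocs(one_mib, 1024)}"
```

For a 1 MiB source read in 1024-byte chunks, the linear policy reallocates on every chunk, while the doubling policy needs only about log2(total / chunk) growth steps.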
ping. status?
Do you need help or comments?
ko1 (Koichi Sasada) wrote:
ping. status?
Do you need help or comments?
Thanks for your help, to be honest, I haven't tried so far. Can we leave it at 2.0.0 target for now? If I run into problems, I'll ask here!
Martin Bosslet Martin.Bosslet@googlemail.com wrote:
In the general case, read_all grows its buffer linearly by just the
amount that is currently read from the underlying source. This results
in a linear number of reallocs. It might turn out beneficial if the
buffer were grown exponentially by multiplying with a constant factor
(e.g. 1.5 or 2), thus resulting in only a logarithmic number of
reallocs.
I think growing the buffer exponentially makes sense.
I would enforce a hard limit (probably <= 8 MB) for each growth,
to:
- discourage read_all() for large files, it's very wasteful and
  usually hurts performance
- prevent memory exhaustion for edge cases (especially on 32-bit)
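One way to picture the suggested hard limit is the sketch below. It is a hypothetical helper, not code from any patch: the names next_capacity, MIN_GROWTH, and MAX_GROWTH are assumptions, and 8 MB is simply the upper bound mentioned above.

```ruby
# Hypothetical sketch: double the capacity, but never add more than
# MAX_GROWTH bytes (or fewer than MIN_GROWTH bytes) in a single step.
MIN_GROWTH = 1024            # assumed floor so an empty buffer can still grow
MAX_GROWTH = 8 * 1024 * 1024 # assumed cap, taken from the "<= 8 MB" suggestion

def next_capacity(capacity, min_growth = MIN_GROWTH, max_growth = MAX_GROWTH)
  # Doubling means growing by the current capacity; clamp that step size.
  growth = capacity.clamp(min_growth, max_growth)
  capacity + growth
end

capacity = 0
10.times do
  capacity = next_capacity(capacity)
end
puts capacity
```

With a cap like this, growth is exponential for small buffers and becomes linear in 8 MB steps once the buffer is large, which bounds how much memory a single growth step can over-allocate.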
- Target version changed from 2.0.0 to 2.6
My experience also shows that it is useless to open a ticket for a reminder to myself :-)
I'm setting to next minor tentatively, but if it is really just a performance improvement (i.e., it affects no external modules), you can commit it to 2.0.0 before code freeze.
--
Yusuke Endoh mame@tsg.ne.jp
- Assignee changed from MartinBosslet (Martin Bosslet) to 7150
- Status changed from Assigned to Open
I just tried my hand at this one: https://github.com/ruby/ruby/pull/6829
I think such a change would make sense. Not that IO#read
without a size is common, but might as well do something sensible.
- Status changed from Open to Closed
Applied in changeset git|7390eb43fe1bfb069af80ba8f73f7dc4999df0fd.
io.c (read_all): grow the buffer exponentially when size is unknown
[Feature #6047]
Currently it's grown by BUFSIZ (1024) on every iteration which is a bit wasteful.
Instead we can double the capacity whenever there is less than BUFSIZ capacity left.
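The policy described in the commit message can be sketched in Ruby as follows. This is a simulation under stated assumptions, not the C code from the changeset: read_all_sketch is a hypothetical name, the capacity is tracked in a plain integer rather than in the string's real allocation, and BUFSIZ is set to the 1024 quoted above.

```ruby
require "stringio"

BUFSIZ = 1024 # assumption: the value quoted in the commit message

# Hypothetical simulation: read until EOF and double a tracked capacity
# whenever fewer than BUFSIZ bytes of room remain.
# Returns the data read and the number of times the capacity grew.
def read_all_sketch(io)
  data = String.new(capacity: BUFSIZ)
  capacity = BUFSIZ
  growths = 0
  loop do
    if capacity - data.bytesize < BUFSIZ
      capacity *= 2
      growths += 1
    end
    # Ask for as much as currently fits; IO#read returns nil at EOF.
    chunk = io.read(capacity - data.bytesize)
    break if chunk.nil? || chunk.empty?
    data << chunk
  end
  [data, growths]
end

data, growths = read_all_sketch(StringIO.new("x" * 10_000))
puts "read #{data.bytesize} bytes with #{growths} capacity doublings"
```

Because the filled size never exceeds the tracked capacity, a single doubling always leaves at least BUFSIZ bytes of room, so the check needs to run only once per iteration.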