Bug #8585 (closed)
Time for CSV.generate grows quadratically with number of rows
Description
Hi,
I want to generate a CSV string from millions of rows.
The time to create the string grows quadratically
with the number of rows. Because of this, I cannot use
ruby 2.0.0 to create the CSV file.
I did not see this problem in ruby 1.9.3.
The problem is present in ruby 2.0.0 and ruby-head.
Using ruby-head
Installed with rvm reinstall ruby-head
(built from version 3a01b9e)
peter_v@peter64:~/p/dbd$ rvm use ruby-head
Using /home/peter_v/.rvm/gems/ruby-head
peter_v@peter64:~/p/dbd$ ruby -v
ruby 2.1.0dev (2013-06-30) [x86_64-linux]
peter_v@peter64:~/p/dbd$ uname -a
Linux peter64 3.5.0-34-generic #55~precise1-Ubuntu SMP Fri Jun 7 16:25:50 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
peter_v@peter64:~/p/dbd$ rvm current
ruby-head
peter_v@peter64:~/p/dbd$ cat bin/test_4.rb
#!/usr/bin/env ruby
count = ARGV[0].to_i
unless count > 0
  puts "Give a 'count' as first argument."
  exit(1)
end
require 'csv'
row_data = [
  "59ffbb3b-1e48-4c1f-81d8-d93afc84c966",
  "2013-06-28 19:14:55.975000806 UTC",
  "a11f290e-c441-41bc-8b8c-4e6c27b1b6fc",
  "c73e6241-d46f-4952-8377-c11372346d15",
  "test",
  "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0"]
puts "starting CSV.generate"
start_time = Time.now
csv_string = CSV.generate(force_quotes: true) do |csv|
  count.times do
    csv << row_data
  end
end
puts "CSV.generate took #{Time.now - start_time} seconds"
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 10_000
starting CSV.generate
CSV.generate took 1.01238478 seconds
real 0m1.045s
user 0m1.044s
sys 0m0.004s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 20_000
starting CSV.generate
CSV.generate took 3.815373614 seconds
real 0m3.847s
user 0m3.844s
sys 0m0.000s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 40_000
starting CSV.generate
CSV.generate took 17.176208859 seconds
real 0m17.212s
user 0m17.177s
sys 0m0.020s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 80_000
starting CSV.generate
CSV.generate took 71.400916725 seconds
real 1m11.436s
user 1m11.320s
sys 0m0.036s
peter_v@peter64:~/p/dbd$
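As a rough check of the "quadratic" reading of the ruby-head timings above, the growth exponent can be estimated from each pair of consecutive runs. This is only a sketch: it assumes the time follows t ~ n**k and uses nothing but the four measurements shown. The names timings and exponents are illustrative and not part of the original script.

```ruby
# Timings copied from the ruby-head runs above: rows => seconds.
timings = {
  10_000 => 1.01238478,
  20_000 => 3.815373614,
  40_000 => 17.176208859,
  80_000 => 71.400916725
}

# Assuming t ~ n**k, two measurements give
#   k = log2(t2 / t1) / log2(n2 / n1)
# so an exponent near 2 points to quadratic growth, near 1 to linear growth.
exponents = timings.each_cons(2).map do |(n1, t1), (n2, t2)|
  Math.log2(t2 / t1) / Math.log2(n2.to_f / n1)
end

exponents.each { |k| puts format("estimated exponent: %.2f", k) }
```

Each doubling of the row count multiplies the time by roughly four, so the estimates land close to 2, which is consistent with quadratic growth over this range.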
Using ruby-1.9.3-p448
As expected, time grows LINEARLY with the number of rows.
peter_v@peter64:~/p/dbd$ rvm use ruby-1.9.3
Using /home/peter_v/.rvm/gems/ruby-1.9.3-p448
peter_v@peter64:~/p/dbd$ ruby -v
ruby 1.9.3p448 (2013-06-27 revision 41675) [x86_64-linux]
peter_v@peter64:~/p/dbd$ rvm current
ruby-1.9.3-p448
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 10_000
starting CSV.generate
CSV.generate took 0.125396387 seconds
real 0m0.150s
user 0m0.140s
sys 0m0.008s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 20_000
starting CSV.generate
CSV.generate took 0.249746069 seconds
real 0m0.274s
user 0m0.268s
sys 0m0.004s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 40_000
starting CSV.generate
CSV.generate took 0.498180989 seconds
real 0m0.522s
user 0m0.504s
sys 0m0.016s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 80_000
starting CSV.generate
CSV.generate took 0.991481147 seconds
real 0m1.015s
user 0m1.000s
sys 0m0.016s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 100_000
starting CSV.generate
CSV.generate took 1.243347153 seconds
real 0m1.265s
user 0m1.240s
sys 0m0.020s
peter_v@peter64:~/p/dbd$ time bin/test_4.rb 1_000_000
starting CSV.generate
CSV.generate took 12.461711974 seconds
real 0m12.492s
user 0m12.405s
sys 0m0.080s
peter_v@peter64:~/p/dbd$
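One way to check the linear behaviour of the ruby-1.9.3-p448 runs above is to divide each time by its row count: under linear growth the cost per row should stay roughly constant. A minimal sketch, using only the six measurements shown. The names timings and per_row_us are illustrative.

```ruby
# Timings copied from the ruby-1.9.3-p448 runs above: rows => seconds.
timings = {
  10_000    => 0.125396387,
  20_000    => 0.249746069,
  40_000    => 0.498180989,
  80_000    => 0.991481147,
  100_000   => 1.243347153,
  1_000_000 => 12.461711974
}

# Cost per row in microseconds; a flat series indicates linear growth.
per_row_us = timings.map { |rows, seconds| seconds / rows * 1_000_000 }

timings.keys.zip(per_row_us).each do |rows, us|
  puts format("%9d rows: %.2f microseconds per row", rows, us)
end
```

The per-row cost stays between about 12 and 13 microseconds from 10_000 to 1_000_000 rows, which matches the linear growth described for 1.9.3.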