Bug #15770

closed

CSV skip_lines param affects data

Added by skyksandr (Aleksandr Kunin) about 5 years ago. Updated about 5 years ago.

Status:
Closed
Target version:
-
[ruby-core:92293]

Description

The script below works on Ruby 2.5.*, but does not work on 2.6.*:

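require 'csv'
require 'date'

counter = 0

# Skip every line whose first four characters are all non-digits.
CSV.foreach('./05-31-20.CSV', skip_lines: /^[^0-9]{4}/) do |row|
  time = row[0]

  # Print any first field that is shorter than the expected 23 characters.
  p time if time.length < 23
  counter += 1
end

p "Processed: #{counter} lines"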

And the result is:

"03-09T09:40:04.00Z"
"Processed: 4424 lines"

So there are two problems:

  1. The first field of line 4424 is corrupted: its first 5 characters ("2019-") are cut off.
  2. The file is not parsed completely: it contains 4497 lines in total, but only 4424 are processed.
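Both symptoms can be checked without the attached file. The following is a minimal sketch, not the reporter's script: `build_sample` and `check_skip_lines` are hypothetical helper names, and the generated data only imitates the shape of the real file. It compares what CSV returns against a plain line-by-line filter of the same input. Whether this small sample triggers the bug on an affected version is an assumption that has not been verified here.

```ruby
require 'csv'
require 'stringio'

# Hypothetical stand-in for the attached file: a header, timestamped rows,
# and occasional non-data lines that the pattern should skip.
def build_sample(row_count)
  lines = ['Time,Value']
  row_count.times do |i|
    lines << format('2019-03-09T09:40:%02d.00Z,%d', i % 60, i)
    lines << 'note,ignored' if (i % 10).zero?
  end
  lines.join("\n") + "\n"
end

# Compare the rows CSV returns with a plain line-by-line filter of the data.
def check_skip_lines(data, pattern)
  expected = data.each_line
                 .reject { |line| line.match?(pattern) }
                 .map { |line| line.chomp.split(',') }
  actual = CSV.new(StringIO.new(data), skip_lines: pattern).read
  { expected: expected.size, actual: actual.size, intact: expected == actual }
end

result = check_skip_lines(build_sample(2000), /^[^0-9]{4}/)
p result
```

If `intact` is false or the two counts differ, the parser has dropped or altered rows, which matches the two problems listed above.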

EDIT:
With the regex /^(?![0-9]{4})/, in addition to corrupting the first field, the parser hangs in an infinite loop.
Stack trace (to give you an idea of where to look):

	7: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:509:in `foreach'
	6: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:657:in `open'
	5: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:510:in `block in foreach'
	4: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:1176:in `each'
	3: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:265:in `parse'
	2: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:583:in `skip_needless_lines'
	1: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:704:in `parse_row_end'
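When trying the second regex, it may help to guard the parse with a deadline so that a hang is reported instead of blocking the terminal. This is a sketch under assumptions: `parse_with_deadline` is a hypothetical helper, the 5 second limit is arbitrary, and the sample data is made up rather than taken from the attached file.

```ruby
require 'csv'
require 'stringio'
require 'timeout'

# Parse data with pattern as skip_lines, giving up after the given seconds.
# Returns the row count, or :timed_out if the parser did not finish in time.
def parse_with_deadline(data, pattern, seconds: 5)
  Timeout.timeout(seconds) do
    CSV.new(StringIO.new(data), skip_lines: pattern).read.size
  end
rescue Timeout::Error
  :timed_out
end

# Hypothetical sample: one header line followed by three data rows.
sample = "Time,Value\n" \
         "2019-03-09T09:40:04.00Z,1\n" \
         "2019-03-09T09:40:05.00Z,2\n" \
         "2019-03-09T09:40:06.00Z,3\n"

outcome = parse_with_deadline(sample, /^(?![0-9]{4})/)
p outcome
```

A result of `:timed_out` would suggest the infinite loop described above; a row count means the parser finished.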

EDIT2: The issue is reproducible with csv 3.0.4, but is resolved in csv 3.0.9.
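To see which csv version a given Ruby installation loads, one can compare `CSV::VERSION` with the release reported as fixed. This is a small sketch: `csv_has_fix?` is a hypothetical helper, and it only checks for 3.0.9 or later, since the exact range of affected releases is not stated in this report (the older csv bundled with Ruby 2.5.* was reported as working).

```ruby
require 'csv'

# Release in which the reporter found the problem resolved.
FIXED_IN = Gem::Version.new('3.0.9')

# True when the given csv version is at least the release reported as fixed.
def csv_has_fix?(version = CSV::VERSION)
  Gem::Version.new(version) >= FIXED_IN
end

puts "csv #{CSV::VERSION}, includes fix from 3.0.9: #{csv_has_fix?}"
```

If the check returns false on a Ruby 2.6.* installation, installing a newer csv gem (for example with `gem install csv`) and requiring it is one way to pick up the fix.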


Files

05-31-20.CSV (491 KB): Affected CSV file. skyksandr (Aleksandr Kunin), 04/15/2019 11:10 AM
bug.rb (515 Bytes): Reproducible steps. skyksandr (Aleksandr Kunin), 04/15/2019 11:20 AM