Bug #1075

closed

CSV with mixed \r\n and \n line endings raises an error

Added by tommy (Masahiro Tomita) about 15 years ago. Updated almost 13 years ago.

Status: Rejected
Target version: -
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]
Backport:

Description

=begin
Passing "a,\"b\n\",c\r\n" to CSV.new raises an error.

$ ruby -v -rcsv -e 'p CSV.parse("a,\"b\n\",c\r\n")'
ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]
/usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1863:in `block (2 levels) in shift': Unquoted fields do not allow \r or \n (line 1). (CSV::MalformedCSVError)
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1853:in `gsub!'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1853:in `block in shift'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1815:in `loop'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1815:in `shift'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1760:in `each'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1771:in `to_a'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1771:in `read'
        from /usr/local/ruby-1.9.1/lib/ruby/1.9.1/csv.rb:1360:in `parse'
        from -e:1:in `<main>'

With 1.8.7, no error is raised.

$ ruby -v -rcsv -e 'p CSV.parse("a,\"b\n\",c\r\n")'
ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
[["a", "b\n", "c"]]
=end

Actions #1

Updated by ko1 (Koichi Sasada) about 15 years ago

  • Assignee set to JEG2 (James Gray)


Actions #2

Updated by JEG2 (James Gray) almost 15 years ago

  • Status changed from Open to Rejected

=begin
Ruby 1.9 uses an all-new CSV library. It is somewhat stricter in its parsing, as a means of getting dramatically more speed.

Here it is correctly reporting that \r is an illegal character in an unquoted field. That rule comes from the CSV RFC.

The reason it isn't treated as a line ending is that the new library tries to guess your line ending by default. When doing so, the first thing it saw was the bare \n inside the quoted second field. Thus it assumed the line ending was \n. However, it looks like the real line ending here is \r\n.
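
To illustrate why the guess goes wrong here, the following is a minimal sketch of a "first line ending seen wins" heuristic. The helper first_line_ending is hypothetical and written only for this illustration; it is not the csv library's actual detection code, which may differ in detail.

```ruby
# Hypothetical helper, NOT part of the csv library: return the first
# line-ending sequence (\r\n, \n, or \r) that appears anywhere in the text,
# without paying attention to quoting.
def first_line_ending(text)
  text[/\r\n|\n|\r/]
end

data = "a,\"b\n\",c\r\n"

# The bare \n inside the quoted field appears before the trailing \r\n,
# so a guess based on the first line ending seen picks "\n".
p first_line_ending(data)  # => "\n"
```

Under this kind of heuristic, the \r that precedes the final \n is left over as part of the unquoted field c, which is what the error message complains about.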

You can fix this by setting the line ending explicitly, so that CSV doesn't have to guess (and guess wrong, as it does in this case):

$ ruby_dev -v -rcsv -e 'p CSV.parse("a,\"b\n\",c\r\n", row_sep: "\r\n")'
ruby 1.9.1p129 (2009-05-12 revision 23412) [i386-darwin9.6.0]
[["a", "b\n", "c"]]
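
The same comparison can be written as a small script rather than a one-liner. This is only a sketch: it assumes the csv library is installed and loadable with require "csv", and it does not assume what the auto-detected case does on any particular csv version, so that call is wrapped in a rescue and its result is simply printed.

```ruby
require "csv"

data = "a,\"b\n\",c\r\n"

# Explicit row separator: the \n inside the quoted field is kept as data,
# and the trailing \r\n ends the row.
rows = CSV.parse(data, row_sep: "\r\n")
p rows  # => [["a", "b\n", "c"]]

# Default (auto-detected) row separator: this is the call from the report.
# Whether it raises may depend on the csv version, so the error is rescued
# and reported instead of being assumed.
auto_result =
  begin
    CSV.parse(data)
  rescue CSV::MalformedCSVError => e
    "MalformedCSVError: #{e.message}"
  end
p auto_result
```

Passing row_sep explicitly is worth considering whenever quoted fields may contain line breaks that differ from the file's real row terminator.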

Hope that helps.

=end
