Project

General

Profile

Actions

Bug #18768

closed

Inconsistent behavior of IO, StringIO and String each_line methods when return paragraph and chomp: true passed

Added by andrykonchin (Andrew Konchin) over 2 years ago. Updated over 2 years ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:108500]

Description

In IO, StringIO and String methods that return lines (e.g. each, each_line, gets, readline, readlines) behave in a different way when the following parameters are passed

  • separator is "" and
  • chomp is true.

They truncate the new line characters between paragraphs and the trailing one differently:

"a\n\nb\n\nc\n".each_line("", chomp: true).to_a
#=> ["a\n", "b\n", "c\n"]

StringIO.new("a\n\nb\n\nc\n").each_line("", chomp: true).to_a
#=> ["a\n", "b\n", "c"]

File.open('chomp.txt').each_line("", chomp: true).to_a
#=> ["a", "b", "c\n"]

The text file content is the same:

File.read('chomp.txt')
#=> "a\n\nb\n\nc\n"

Expected behavior - they return the same result.


Related issues 1 (0 open1 closed)

Related to Ruby master - Bug #18770: Inconsistent behavior of IO/StringIO's each methods when called with nil as a separator, limit and chomp: trueClosedActions
Actions #1

Updated by andrykonchin (Andrew Konchin) over 2 years ago

  • Description updated (diff)
Actions #2

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

  • Related to Bug #18770: Inconsistent behavior of IO/StringIO's each methods when called with nil as a separator, limit and chomp: true added

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

I agree that the behavior for all of these methods should be the same. I think the IO behavior makes the most sense, and String and StringIO should be changed to match it.

I submitted pull requests to Ruby (https://github.com/ruby/ruby/pull/5960) and StringIO (https://github.com/ruby/stringio/pull/29) to fix this.

Actions #4

Updated by jeremyevans (Jeremy Evans) over 2 years ago

  • Status changed from Open to Closed

Applied in changeset git|609d73e8925f807786686caf635178bb1de74256.


[ruby/stringio] Fix handling of chomp with paragraph separator

Try to mirror IO behavior, where chomp takes out the entire paragraph
separators between entries, but does not chomp a single line separator
at the end of the string.

Partially Fixes [Bug #18768]

https://github.com/ruby/stringio/commit/a83ddbb7f0

Actions #5

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

  • Status changed from Closed to Open

Updated by mame (Yusuke Endoh) over 2 years ago

I have no strong opinion about this issue and am never against Jeremy's change, but I'm just curious. I thought that each_line(..., chomp: true) {|s| ... } was equal to each_line(...) {|s| s = s.chomp; ... }. In the above examples, StringIO's result (["a\n", "b\n", "c"]) was the simplest to me. @jeremyevans0 (Jeremy Evans) How did you think the IO behavior made the most sense?

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

mame (Yusuke Endoh) wrote in #note-6:

I have no strong opinion about this issue and am never against Jeremy's change, but I'm just curious. I thought that each_line(..., chomp: true) {|s| ... } was equal to each_line(...) {|s| s = s.chomp; ... }. In the above examples, StringIO's result (["a\n", "b\n", "c"]) was the simplest to me. @jeremyevans0 (Jeremy Evans) How did you think the IO behavior made the most sense?

My thought here is that chomp: true, does not mean chomp with no arguments, it means chomp with the separator used for the each_line. So when you call each_line with '' (paragraph separator), it only chomps the paragraph separator. That's what the IO behavior already was, and I thought it best to make String/StringIO consistent with that.

Updated by mame (Yusuke Endoh) over 2 years ago

@jeremyevans0 (Jeremy Evans) Thank you for your explanation, I understand. The keyword name "chomp" might be a bit confusing.

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

mame (Yusuke Endoh) wrote in #note-8:

@jeremyevans0 (Jeremy Evans) Thank you for your explanation, I understand. The keyword name "chomp" might be a bit confusing.

I agree, the behavior is not obvious from the name chomp. However, it is obvious that chomp is supposed to remove the separator and not the generic newline. You can see this when providing an explicit line ending character:

'abc'.each_line('b', chomp: true).to_a
# => ["a", "c"]

require 'stringio'
StringIO.new('abc').each_line('b', chomp: true).to_a
# => ["a", "c"]

r, w = IO.pipe
w.write('abc')
w.close
r.each_line('b', chomp: true).to_a
# => ["a", "c"]

Updated by mame (Yusuke Endoh) over 2 years ago

We discussed this at the dev meeting, and @matz (Yukihiro Matsumoto) agreed with @jeremyevans0 (Jeremy Evans): chomp: true should remove the separator. Could you please merge the change?

Actions #11

Updated by jeremyevans (Jeremy Evans) over 2 years ago

  • Status changed from Open to Closed

Applied in changeset git|423b41cba77719b4f62aa530593ad36a990f7c74.


Make String#each_line work correctly with paragraph separator and chomp

Previously, it was including one newline when chomp was used,
which is inconsistent with IO#each_line behavior. This makes
behavior consistent with IO#each_line, chomping all paragraph
separators (multiple consecutive newlines), but not single
newlines.

Partially Fixes [Bug #18768]

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0