Bug #7646
closed
String#each_line raises "invalid byte sequence"
Description
=begin
When a separator is passed to String#each_line, non-ASCII characters cause an "invalid byte sequence" error.

  $ ruby -ve '"\n\u0100".each_line("\n") {|l| p l }'
  ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]
  "\n"
  -e:1:in `each_line': invalid byte sequence in UTF-8 (ArgumentError)
          from -e:1:in `<main>'

This appears to be a bug introduced by a change around r38616.
  --- string.c.org 2012-12-27 21:57:07.000000000 +0900
  +++ string.c 2013-01-02 23:36:47.000000000 +0900
  @@ -6199,14 +6199,14 @@
           if (c == newline &&
               (rslen <= 1 ||
                (pend - p >= rslen && memcmp(RSTRING_PTR(rs), p, rslen) == 0))) {
  -            p += (rslen ? rslen : n);
  -            line = rb_str_subseq(str, s - ptr, p - s);
  +            const char *pp = p + (rslen ? rslen : n);
  +            line = rb_str_subseq(str, s - ptr, pp - s);
               if (wantarray)
                   rb_ary_push(ary, line);
               else
                   rb_yield(line);
               str_mod_check(str, ptr, len);
  -            s = p;
  +            s = pp;
           }
           p += n;
       }
=end
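To make the failure mode in the patch easier to follow, here is a rough Ruby model of the scanning loop. It is a simplified sketch, not the actual string.c code: `char_at`, `scan_lines`, and the `buggy:` flag are hypothetical names introduced for illustration, and the model handles only a plain separator in a UTF-8 string. In the buggy branch the cursor is moved past the separator and then advanced again by the character width, so it can land in the middle of a multibyte character.

```ruby
# Hypothetical helper: decode the character that starts at byte offset pos,
# raising if pos points into the middle of a multibyte sequence.
def char_at(str, pos)
  (1..4).each do |len|
    ch = str.byteslice(pos, len)
    return ch if ch.valid_encoding?
  end
  raise ArgumentError, "invalid byte sequence at byte #{pos}"
end

# Hypothetical, simplified model of the line-splitting loop.
# buggy: true mimics the pre-patch behavior; buggy: false mimics the patch.
def scan_lines(str, sep, buggy:)
  bytes = str.b                 # binary copy for byte-wise comparison
  sep_b = sep.b
  lines = []
  start = 0                     # byte offset where the current line begins
  cur = 0                       # byte offset of the scan cursor
  while cur < bytes.bytesize
    n = char_at(str, cur).bytesize
    if bytes.byteslice(cur, sep_b.bytesize) == sep_b
      if buggy
        cur += sep_b.bytesize   # cursor itself is moved past the separator
        lines << str.byteslice(start, cur - start)
        start = cur
      else
        nxt = cur + sep_b.bytesize   # temporary, like `pp` in the patch
        lines << str.byteslice(start, nxt - start)
        start = nxt
      end
    end
    cur += n                    # unconditional advance; doubles up when buggy
  end
  lines << str.byteslice(start, bytes.bytesize - start) if start < bytes.bytesize
  lines
end

p scan_lines("\n\u0100", "\n", buggy: false)
begin
  scan_lines("\n\u0100", "\n", buggy: true)
rescue ArgumentError => e
  puts "buggy model: #{e.message}"
end
```

With the input from the report, the buggy branch emits "\n", leaves the cursor at byte 2 (the second byte of U+0100), and fails on the next decode, which matches the reported output.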
Updated by kosaki (Motohiro KOSAKI) about 12 years ago
- Category set to core
- Status changed from Open to Assigned
- Assignee set to nobu (Nobuyoshi Nakada)
- Priority changed from Normal to 5
- Target version set to 2.0.0
This looks like a regression no matter how you look at it.
I'll tag it 2.0.0.
Updated by Anonymous about 12 years ago
- % Done changed from 0 to 100
- Status changed from Assigned to Closed
This issue was solved with changeset r38704.
Yoshida, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- string.c (rb_str_enumerate_lines): fix invalid byte sequence error
  when a separator is passed. The patch is from yoshidam (Yoshida
  Masato). [Bug #7646] [ruby-dev:46827]
- test/ruby/test_string.rb: a test for above.
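As an illustration of what a regression check for this issue could look like, the sketch below exercises String#each_line with a separator followed by a non-ASCII character. The cases and the names `EACH_LINE_CASES` and `each_line_failures` are assumptions made for this example; the test actually added to test/ruby/test_string.rb in r38704 may differ.

```ruby
# Hypothetical regression cases: [input, separator] => expected lines.
EACH_LINE_CASES = {
  ["\n\u0100", "\n"]       => ["\n", "\u0100"],       # the case from the report
  ["a\r\n\u0100b", "\r\n"] => ["a\r\n", "\u0100b"],   # multi-byte separator
}

# Returns the cases whose actual result differs from the expectation.
def each_line_failures(cases = EACH_LINE_CASES)
  cases.reject { |(str, sep), expected| str.each_line(sep).to_a == expected }
end

p each_line_failures   # prints {} on an interpreter that has the fix
```

On an affected build the first case would raise ArgumentError instead of returning, so a real test would likely wrap the call in an assertion that nothing is raised.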