Bug #14388

A validly encoded string sliced from an invalidly encoded string is reported as invalid encoding

Added by tommy (Masahiro Tomita) over 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-linux]
[ruby-dev:50424]

Description

data = "\xFFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
p data.encoding               #=> #<Encoding:UTF-8>
p data                        #=> "\xFFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
p data.valid_encoding?        #=> false
data2 = data[1..-1]
p data2                       #=> "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
p data2.valid_encoding?       #=> false
data3 = data2 + ""
p data3.valid_encoding?       #=> true

data is invalid, but data2, which is sliced from data, should be valid.
Appending an empty string to data2 makes it valid.
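
The report above suggests that the substring keeps a stale cached code range from its parent, and that building a fresh string makes Ruby scan the bytes again. The following is a minimal sketch of that idea, not the fix itself. It assumes that 40 trailing bytes is long enough to reproduce the situation, and it uses `dup` plus `force_encoding` (which discards the cached code range) in place of the `+ ""` trick from the report. The variable names are illustrative.

```ruby
# Assumption: 40 trailing bytes is long enough for the substring to share
# the parent's buffer; the exact threshold depends on the Ruby build.
broken = ("\xFF" + "a" * 40).force_encoding(Encoding::UTF_8)
broken.valid_encoding?   # scans the bytes and caches the result on the parent

# Drop the only invalid byte; what remains is pure ASCII.
tail = broken[1..-1]

# dup copies the string, and force_encoding (even with the same encoding)
# discards the cached code range, so the next check rescans the bytes.
rescanned = tail.dup.force_encoding(tail.encoding)

puts "tail valid?      #{tail.valid_encoding?}"
puts "rescanned valid? #{rescanned.valid_encoding?}"
```

On an affected interpreter the two lines would be expected to disagree, as in the report; on a patched one they should agree.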

Associated revisions

Revision 9237049e
Added by nobu (Nobuyoshi Nakada) over 1 year ago

string.c: clear substring code range

  • string.c (str_substr): substring of broken code range string may be valid or broken. patch by tommy (Masahiro Tomita) at [ruby-dev:50430] [Bug #14388].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 62040
Added by nobu (Nobuyoshi Nakada) over 1 year ago

string.c: clear substring code range

  • string.c (str_substr): substring of broken code range string may be valid or broken. patch by tommy (Masahiro Tomita) at [ruby-dev:50430] [Bug #14388].

Revision c1d4e3fe
Added by naruse (Yui NARUSE) over 1 year ago

merge revision(s) 62040: [Backport #14388]

    string.c: clear substring code range

    * string.c (str_substr): substring of broken code range string may
      be valid or broken.  patch by tommy (Masahiro Tomita) at
      [ruby-dev:50430] [Bug #14388].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_5@62483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 62483
Added by naruse (Yui NARUSE) over 1 year ago

merge revision(s) 62040: [Backport #14388]

string.c: clear substring code range

* string.c (str_substr): substring of broken code range string may
  be valid or broken.  patch by tommy (Masahiro Tomita) at
  [ruby-dev:50430] [Bug #14388].

Revision 681d1e79
Added by nagachika (Tomoyuki Chikanaga) over 1 year ago

merge revision(s) 62040: [Backport #14388]

    string.c: clear substring code range

    * string.c (str_substr): substring of broken code range string may
      be valid or broken.  patch by tommy (Masahiro Tomita) at
      [ruby-dev:50430] [Bug #14388].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_4@62875 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 62875
Added by nagachika (Tomoyuki Chikanaga) over 1 year ago

merge revision(s) 62040: [Backport #14388]

string.c: clear substring code range

* string.c (str_substr): substring of broken code range string may
  be valid or broken.  patch by tommy (Masahiro Tomita) at
  [ruby-dev:50430] [Bug #14388].

Revision aaf1f031
Added by usa (Usaku NAKAMURA) about 1 year ago

merge revision(s) 62040: [Backport #14388]

    string.c: clear substring code range

    * string.c (str_substr): substring of broken code range string may
      be valid or broken.  patch by tommy (Masahiro Tomita) at
      [ruby-dev:50430] [Bug #14388].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@62946 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 62946
Added by usa (Usaku NAKAMURA) about 1 year ago

merge revision(s) 62040: [Backport #14388]

string.c: clear substring code range

* string.c (str_substr): substring of broken code range string may
  be valid or broken.  patch by tommy (Masahiro Tomita) at
  [ruby-dev:50430] [Bug #14388].

History

Updated by tommy (Masahiro Tomita) over 1 year ago

I'm not very familiar with Ruby's internal code, but I think this fixes it. What do you think?

diff --git a/string.c b/string.c
index 82fa603ada..9079387fac 100644
--- a/string.c
+++ b/string.c
@@ -2560,6 +2560,7 @@ str_substr(VALUE str, long beg, long len, int empty)
    str2 = str_new_shared(rb_obj_class(str2), str2);
    RSTRING(str2)->as.heap.ptr += ofs;
    RSTRING(str2)->as.heap.len = len;
+   ENC_CODERANGE_CLEAR(str2);
     }
     else {
    if (!len && !empty) return Qnil;
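
The patch clears the code range only in the branch that creates a shared substring, so whether the bug shows up may depend on the length of the string. Below is a rough regression check in Ruby that sweeps a range of lengths. It is a sketch under assumptions: `tails_with_stale_code_range` is a hypothetical helper, not part of Ruby's test suite, and the upper bound of 64 is arbitrary.

```ruby
# Hypothetical helper: returns the tail lengths for which a pure-ASCII
# substring of a broken UTF-8 string is still reported as invalid.
def tails_with_stale_code_range(max_len)
  (1..max_len).reject do |n|
    broken = ("\xFF" + "a" * n).force_encoding(Encoding::UTF_8)
    broken.valid_encoding?         # cache the broken code range on the parent
    broken[1..-1].valid_encoding?  # the tail is pure ASCII, so this should be true
  end
end

stale = tails_with_stale_code_range(64)
puts stale.empty? ? "no stale code ranges" : "stale at lengths: #{stale.inspect}"
```

An empty result means that every substring was rescanned correctly on the interpreter running the check.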

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

  • Status changed from Open to Closed

Applied in changeset trunk|r62040.


string.c: clear substring code range

  • string.c (str_substr): substring of broken code range string may be valid or broken. patch by tommy (Masahiro Tomita) at [ruby-dev:50430] [Bug #14388].

Updated by nagachika (Tomoyuki Chikanaga) over 1 year ago

  • Backport changed from 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN to 2.3: REQUIRED, 2.4: REQUIRED, 2.5: REQUIRED

Updated by naruse (Yui NARUSE) over 1 year ago

  • Backport changed from 2.3: REQUIRED, 2.4: REQUIRED, 2.5: REQUIRED to 2.3: REQUIRED, 2.4: REQUIRED, 2.5: DONE

ruby_2_5 r62483 merged revision(s) 62040.

Updated by nagachika (Tomoyuki Chikanaga) over 1 year ago

  • Backport changed from 2.3: REQUIRED, 2.4: REQUIRED, 2.5: DONE to 2.3: REQUIRED, 2.4: DONE, 2.5: DONE

ruby_2_4 r62875 merged revision(s) 62040.

Updated by usa (Usaku NAKAMURA) about 1 year ago

  • Backport changed from 2.3: REQUIRED, 2.4: DONE, 2.5: DONE to 2.3: DONE, 2.4: DONE, 2.5: DONE

ruby_2_3 r62946 merged revision(s) 62040.
