Backport #5635

Behavior of String#unpack("M") on invalid data

Added by tommy (Masahiro Tomita) over 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
[ruby-dev:44875]

Description

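When String#unpack("M") encounters invalid data such as "=hoge", it currently
stops processing at that point. Throwing away all of the data after that point
seems a shame, so I think it would be better to advance the pointer by one and
continue processing. What do you think?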
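
To make the data loss concrete, here is a minimal pure-Ruby model of the decoding loop as it reads before the patch. It is a sketch transcribed from the lines the patch removes from pack.c, not Ruby's actual implementation, and the name `qp_decode_truncating` is hypothetical.

```ruby
HEX_DIGIT = /\A[0-9A-Fa-f]\z/

# Hypothetical model of the pre-patch loop: decoding stops at the first
# invalid "=" sequence and everything from that point on is dropped.
def qp_decode_truncating(src)
  s = src.b
  out = "".b
  i = 0
  while i < s.bytesize
    if s[i] == "="
      i += 1
      break if i == s.bytesize
      # "=\r\n" is a soft line break: step over the CR so the LF check applies.
      i += 1 if s[i] == "\r" && s[i + 1] == "\n"
      if s[i] != "\n"
        break unless HEX_DIGIT.match?(s[i])
        hi = s[i]
        i += 1
        break if i == s.bytesize
        break unless HEX_DIGIT.match?(s[i])
        out << (hi + s[i]).hex.chr
      end
    else
      out << s[i]
    end
    i += 1
  end
  out
end

p qp_decode_truncating("pre=31after")  # => "pre1after"
p qp_decode_truncating("pre=hoge")     # => "pre"
```

In this model, "=hoge" and everything after it are lost, which is the behavior the report objects to.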

RFC 2045 contains the following passage.

(2)   An "=" followed by a character that is neither a
      hexadecimal digit (including "abcdef") nor the CR
      character of a CRLF pair is illegal.  This case can be
      the result of US-ASCII text having been included in a
      quoted-printable part of a message without itself
      having been subjected to quoted-printable encoding.  A
      reasonable approach by a robust implementation might be
      to include the "=" character and the following
      character in the decoded data without any
      transformation and, if possible, indicate to the user
      that proper decoding was not possible at this point in
      the data.
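
One possible reading of this passage can be sketched in Ruby as follows: keep the "=" and the character after it untouched, carry on decoding, and report where the illegal sequences were. This is an illustration of the RFC text only; the function name `qp_decode_lenient` and the choice to return byte offsets are assumptions, not part of Ruby or of the patch.

```ruby
# Hypothetical lenient decoder following RFC 2045 section 6.7 rule (2):
# an illegal "=X" is copied through unchanged and its offset is recorded.
# Returns [decoded_string, offsets_of_illegal_sequences].
def qp_decode_lenient(src)
  s = src.b
  out = "".b
  illegal = []
  i = 0
  while i < s.bytesize
    if s[i] != "="
      out << s[i]
      i += 1
    elsif s[i + 1, 2] =~ /\A\h\h\z/
      out << s[i + 1, 2].hex.chr   # "=41" decodes to "A"
      i += 3
    elsif s[i + 1] == "\n"
      i += 2                       # soft line break "=\n"
    elsif s[i + 1, 2] == "\r\n"
      i += 3                       # soft line break "=\r\n"
    else
      illegal << i                 # remember where decoding was not possible
      out << s[i, 2]               # "=" plus the following character, as-is
      i += 2
    end
  end
  [out, illegal]
end

decoded, bad = qp_decode_lenient("pre=hoge=31")
p decoded  # => "pre=hoge1"
p bad      # => [3]
```

Returning the offsets is one way to "indicate to the user" that decoding failed at those points; an implementation could equally warn or raise.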

    Index: pack.c
    --- pack.c (revision 33758)
    +++ pack.c (working copy)
    @@ -2008,20 +2008,23 @@
     
         while (s < send) {
             if (*s == '=') {
    -            if (++s == send) break;
    -            if (s+1 < send && *s == '\r' && *(s+1) == '\n')
    -                s++;
    -            if (*s != '\n') {
    -                if ((c1 = hex2num(*s)) == -1) break;
    -                if (++s == send) break;
    -                if ((c2 = hex2num(*s)) == -1) break;
    -                *ptr++ = c1 << 4 | c2;
    +            if (s+1 < send && *(s+1) == '\n') {
    +                s += 2;
    +                continue;
    +            }
    +            if (s+2 < send) {
    +                if (*(s+1) == '\r' && *(s+2) == '\n') {
    +                    s += 3;
    +                    continue;
    +                }
    +                if ((c1 = hex2num(*(s+1))) > -1 && (c2 = hex2num(*(s+2))) > -1) {
    +                    *ptr++ = c1 << 4 | c2;
    +                    s += 3;
    +                    continue;
    +                }
                 }
             }
    -        else {
    -            *ptr++ = *s;
    -        }
    -        s++;
    +        *ptr++ = *s++;
         }
         rb_str_set_len(buf, ptr - RSTRING_PTR(buf));
         ENCODING_CODERANGE_SET(buf, rb_ascii8bit_encindex(), ENC_CODERANGE_VALID);

    Index: test/ruby/test_pack.rb
    --- test/ruby/test_pack.rb (revision 33758)
    +++ test/ruby/test_pack.rb (working copy)
    @@ -612,6 +612,17 @@
         assert_equal([0x100000000], "\220\200\200\200\000".unpack("w"), [0x100000000])
       end
     
    +  def test_pack_unpack_M
    +    assert_equal(["pre123after"], "pre=31=32=33after".unpack("M"))
    +    assert_equal(["preafter"], "pre=\nafter".unpack("M"))
    +    assert_equal(["preafter"], "pre=\r\nafter".unpack("M"))
    +    assert_equal(["pre="], "pre=".unpack("M"))
    +    assert_equal(["pre=\r"], "pre=\r".unpack("M"))
    +    assert_equal(["pre=hoge"], "pre=hoge".unpack("M"))
    +    assert_equal(["pre=1after"], "pre==31after".unpack("M"))
    +    assert_equal(["pre==1after"], "pre===31after".unpack("M"))
    +  end
    +
       def test_modify_under_safe4
         s = "foo"
         assert_raise(SecurityError) do
Associated revisions

Revision b7df3e9f
Added by naruse (Yui NARUSE) about 7 years ago

  • pack.c (pack_unpack): when unpack('M') occurs an illegal byte sequence, output the "=" character and the following character in the decoded data without any transformation. [ruby-dev:44875] [Bug #5635]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34972 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 34972
Added by naruse (Yui NARUSE) about 7 years ago

  • pack.c (pack_unpack): when unpack('M') occurs an illegal byte sequence, output the "=" character and the following character in the decoded data without any transformation. [ruby-dev:44875] [Bug #5635]

Revision b8006774
Added by naruse (Yui NARUSE) over 6 years ago

merge revision(s) 34972:

* pack.c (pack_unpack): when unpack('M') occurs an illegal byte
  sequence, output the "=" character and the following character in
  the decoded data without any transformation.
  [ruby-dev:44875] [Bug #5635]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_3@36669 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 36669
Added by naruse (Yui NARUSE) over 6 years ago

merge revision(s) 34972:

* pack.c (pack_unpack): when unpack('M') occurs an illegal byte
  sequence, output the "=" character and the following character in
  the decoded data without any transformation.
  [ruby-dev:44875] [Bug #5635]

History

Updated by naruse (Yui NARUSE) over 7 years ago

Hmm, if the RFC says to attach the data as-is without transforming it, I think it would be better to do exactly that.

Updated by tommy (Masahiro Tomita) over 7 years ago

Huh? I meant to leave the invalid data as it is.
At the very least, I think that is better than the current behavior of cutting off everything after the invalid data...

Updated by naruse (Yui NARUSE) over 7 years ago

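I read it as saying that processing should not continue. What do you think?
That said, for this kind of communication-related code, going along with what everyone else does is arguably more correct than following the RFC, so examples from other implementations would also be welcome.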

Updated by tommy (Masahiro Tomita) over 7 years ago

Ah, I see: once invalid data is encountered, none of the data after it should be converted at all.
That does seem like the better approach.
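
The approach agreed on here, as I read the thread, can be sketched like this: decode normally until the first invalid "=" sequence, then append the rest of the input verbatim with no further conversion. The function name `qp_decode_stop_at_invalid` is hypothetical, and this is an illustration of the idea, not a transcription of the code committed in r34972.

```ruby
# Hypothetical sketch: stop converting at the first invalid "=" sequence
# and pass everything from that point on through unchanged.
def qp_decode_stop_at_invalid(src)
  s = src.b
  n = s.bytesize
  out = "".b
  i = 0
  while i < n
    if s[i] != "="
      out << s[i]
      i += 1
    elsif s[i + 1] == "\n"
      i += 2                          # soft line break "=\n"
    elsif s[i + 1, 2] == "\r\n"
      i += 3                          # soft line break "=\r\n"
    elsif s[i + 1, 2].match?(/\A\h\h\z/)
      out << s[i + 1, 2].hex.chr      # "=XX" hex escape
      i += 3
    else
      break                           # invalid: give up converting here
    end
  end
  out << s[i..-1] if i < n            # untouched remainder, if any
  out
end

p qp_decode_stop_at_invalid("pre=31=hoge=32")  # => "pre1=hoge=32"
p qp_decode_stop_at_invalid("pre==31after")    # => "pre==31after"
```

Compared with the skip-and-continue patch, nothing after the first invalid sequence is decoded, so "pre==31after" comes back unchanged instead of becoming "pre=1after".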

Updated by ko1 (Koichi Sasada) about 7 years ago

  • Assignee set to naruse (Yui NARUSE)

Updated by naruse (Yui NARUSE) about 7 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r34972.
Masahiro, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • pack.c (pack_unpack): when unpack('M') occurs an illegal byte sequence, output the "=" character and the following character in the decoded data without any transformation. [ruby-dev:44875] [Bug #5635]

Updated by naruse (Yui NARUSE) over 6 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby trunk to Backport193
