Bug #15460: Behaviour of String#setbyte changed - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #15460

closed

Behaviour of String#setbyte changed

Added by gettalong (Thomas Leitner) over 6 years ago. Updated over 5 years ago.

Status:

Closed

Assignee:

shyouhei (Shyouhei Urabe)

Target version:

ruby -v:

ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-linux]

Backport:

2.4: DONTNEED, 2.5: DONTNEED, 2.6: DONE

[ruby-core:90700]

Description

I just installed Ruby 2.6.0 for benchmarking reasons and found that the change string.c: setbyte silently ignores upper bits broke my library/application HexaPDF.

Before using String#setbyte I tested how it would respond to values lower than 0 or greater than 255 and found that it automatically performed the needed modulo 256 operation (at least up to Ruby 2.5.3). Therefore I left out the explicit modulo operation for performance reasons.

Would it make sense to change the String#setbyte implementation to perform the modulo operation? This would restore compatibility with prior Ruby versions and may be what people would expect.

Related issues 2 (0 open — 2 closed)

Actions

Copy link

#1 [ruby-core:90720]

Updated by ko1 (Koichi Sasada) over 6 years ago

FYI

7213568733f673da0d82f95e8a1bccf79ba3f0d3
Author: shyouhei <shyouhei@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
Date:   Mon Nov 19 09:52:46 2018 +0000

    string.c: setbyte silently ignores upper bits
    
    The behaviour of String#setbyte has been depending on the width
    of int, which is not portable.  Must check explicitly.
    
    
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65804 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

:100644 100644 5a5cbb576c31350a416b2121e5efb85aaffa0676 e55c59136a49fb7d2d70805a65a4c7f56519f2e9 M    string.c
:040000 040000 2170f3be4ac9d27e26ce361c38beb7874c07436f e71fc9f00ed5602a30e9f504fc015b9bf7b4dad1 M    test

Actions

Copy link

#2 [ruby-core:90721]

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

I feel sorry for the situation. I didn't expect any actual usage of that corner case.

However the previous behaviour was strange. It did actually raise exception when you pass a very huge number to it:

% ruby -v -e 'p IO.pipe[0].ungetbyte(18446744073709551616)'
ruby 1.9.3p551 (2014-11-13) [x86_64-darwin15.6.0]
-e:1:in `ungetbyte': can't convert Bignum into String (TypeError)
        from -e:1:in `<main>'

The "TypeError" in the error message is because the testing ruby is old. Now that Bignum is wiped out, I don't think this restriction is reasonable.

Actions

Copy link

#3 [ruby-core:90722]

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Apology for the previous comment. It shows IO#ungetbyte example. I confused them because I fixed them the same day for the same reason. The same thing happens for String#setbyte.

% ruby -v -e 'p "foo".setbyte(0,18446744073709551616)'
ruby 1.9.3p551 (2014-11-13) [x86_64-darwin15.6.0]
-e:1:in `setbyte': bignum too big to convert into `long' (RangeError)
        from -e:1:in `<main>'

Actions

Copy link

#4 [ruby-core:90723]

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

We may want to define the behaviour of these methods without introducing fixnum / bignum distinction. One possible way is:

Index: io.c
===================================================================
--- io.c	(revision 66566)
+++ io.c	(working copy)
@@ -4259,8 +4259,8 @@ rb_io_ungetbyte(VALUE io, VALUE b)
     GetOpenFile(io, fptr);
     rb_io_check_byte_readable(fptr);
     if (NIL_P(b)) return Qnil;
-    if (FIXNUM_P(b)) {
-        int i = FIX2INT(b);
+    if (RB_TYPE_P(b, T_FIXNUM) || RB_TYPE_P(b, T_BIGNUM)) {
+        int i = NUM2INT(rb_int_modulo(b, INT2FIX(256)));
         if (0 <= i && i <= UCHAR_MAX) {
             unsigned char cc = i & 0xFF;
             b = rb_str_new((const char *)&cc, 1);
Index: string.c
===================================================================
--- string.c	(revision 66566)
+++ string.c	(working copy)
@@ -5411,7 +5411,7 @@ static VALUE
 rb_str_setbyte(VALUE str, VALUE index, VALUE value)
 {
     long pos = NUM2LONG(index);
-    int byte = NUM2INT(value);
+    int byte = NUM2INT(rb_int_modulo(value, INT2FIX(256)));
     long len = RSTRING_LEN(str);
     char *head, *left = 0;
     unsigned char *ptr;

Actions

Copy link

#5 [ruby-core:90730]

Updated by Eregon (Benoit Daloze) over 6 years ago

I think it's good to be strict here, i.e., to raise RangeError when str.setbyte(256) or higher as that could very well be a bug in the calling code.
Negative values also sound buggy in most situations.

Why are values higher in HexaPDF? Does it mean multiple bytes need to be written or is it enough to throw away the high bits?
I suspect the latter is rarely correct.

Actions

Copy link

Updated by naruse (Yui NARUSE) over 6 years ago

Backport changed from 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN to 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED

Actions

Copy link

#7 [ruby-core:90772]

Updated by gettalong (Thomas Leitner) over 6 years ago

Eregon (Benoit Daloze) wrote:

Why are values higher in HexaPDF? Does it mean multiple bytes need to be written or is it enough to throw away the high bits?
I suspect the latter is rarely correct.

This actually happens in the predictor filter of HexaPDF (see https://github.com/gettalong/hexapdf/blob/master/lib/hexapdf/filter/predictor.rb#L166) which is an implementation of the PNG filter types (see https://www.w3.org/TR/2003/REC-PNG-20031110/#9Filter-types).

The PNG spec says that all operations are to be taken as unsigned arithmetic modulo 256 and since this is what String#setbyte originally did, I just left out the additional module operation.

It is not a huge problem for me and I will release a new HexaPDF version with the fix soon.

As for whether the modulo operation should be done by String#setbyte: Many methods automatically convert arguments to an expected type if possible, e.g. with #to_str. Converting an integer to a byte using modulo 256 seems like the most straight-forward way to do this for String#setbyte.

Actions

Copy link

#8 [ruby-core:91023]

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Status changed from Open to Assigned
Assignee set to shyouhei (Shyouhei Urabe)

I had a chance this week to ask matz if he wants to allow bigger inputs or not for those methods. He answered yes. He prefers mod 256 behaviour for larger numbers. I will fix them again.

Actions

Copy link

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Status changed from Assigned to Closed

Applied in changeset trunk|r66824.

setbyte / ungetbyte allow out-of-range integers

string.c: String#setbyte to accept arbitrary integers [Bug #15460]
io.c: ditto for IO#ungetbyte
ext/strringio/stringio.c: ditto for StringIO#ungetbyte

Actions

Copy link

#10 [ruby-core:91144]

Updated by naruse (Yui NARUSE) over 6 years ago

Backport changed from 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED to 2.4: DONTNEED, 2.5: DONTNEED, 2.6: DONE

ruby_2_6 r66845 merged revision(s) 66760,66761,66824.

Actions

Copy link

#11 [ruby-core:96049]

Updated by nobu (Nobuyoshi Nakada) over 5 years ago

On the 2.6 branch, this causes a declaration-after-statement warning.

Actions

Copy link

#12

Updated by mame (Yusuke Endoh) over 5 years ago

Related to Bug #10453: NUM2CHR() does not perform additional bounds checks added

Actions

Copy link

#13

Updated by Eregon (Benoit Daloze) 11 months ago

Related to Feature #20594: A new String method to append bytes while preserving encoding added

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0

	Related to Ruby - Bug #10453: NUM2CHR() does not perform additional bounds checks	Rejected	matz (Yukihiro Matsumoto)				Actions
	Related to Ruby - Feature #20594: A new String method to append bytes while preserving encoding	Closed	byroot (Jean Boussier)				Actions

Project

General

Profile

Ruby

Tags

Custom queries

Bug #15460

Behaviour of String#setbyte changed

Updated by ko1 (Koichi Sasada) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by Eregon (Benoit Daloze) over 6 years ago

Updated by naruse (Yui NARUSE) over 6 years ago

Updated by gettalong (Thomas Leitner) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by naruse (Yui NARUSE) over 6 years ago

Updated by nobu (Nobuyoshi Nakada) over 5 years ago

Updated by mame (Yusuke Endoh) over 5 years ago

Updated by Eregon (Benoit Daloze) 11 months ago