Feature #19245
openStrict mode for Array#pack that doesn't silently truncate numbers that are too large for the given directive
Description
>> [256].pack("C").unpack1("C")
=> 0
>> [257].pack("C").unpack1("C")
=> 1
This is specified:
it "encodes the least significant 32 bits of a negative number" do
[ [[-0x0000_0021], "\xdf\xff\xff\xff"],
[[-0x0000_4321], "\xdf\xbc\xff\xff"],
[[-0x0065_4321], "\xdf\xbc\x9a\xff"],
[[-0x7865_4321], "\xdf\xbc\x9a\x87"]
].should be_computed_by(:pack, pack_format())
end
But not documented in Array#pack
.
I think that in many case this may lead to silent bugs.
Possible solutions¶
We could have a strict version of pack
, either pack(template, strict: true)
or pack!(template)
.
Or alternatively if we think this is never a desired behavior, we could change pack
to first warn on truncation and later raise.
Updated by byroot (Jean Boussier) about 2 years ago
In the meantime I'd like to document this before we release 3.2 https://github.com/ruby/ruby/pull/6969
Updated by nobu (Nobuyoshi Nakada) about 2 years ago
byroot (Jean Boussier) wrote:
This is specified:
Note that “ruby-spec” is not the specification of ruby.
The word “specified” makes confusion.
Possible solutions¶
We could have a strict version of
pack
, eitherpack(template, strict: true)
orpack!(template)
.
Usually we don’t add !
for stricter behavior.
Updated by byroot (Jean Boussier) about 2 years ago
The word “specified” makes confusion.
Apologies.
Usually we don’t add ! for stricter behavior.
Indeed. If anything it's the version that silently truncate that would be the "more dangerous" version of it, which is usually what !
suffix tend to mean in ruby-core. But that's not really an option. So I guess strict: true
is the way to go.
Updated by Eregon (Benoit Daloze) about 2 years ago
FWIW, #15460 was another discussion about "implicit modulo".
I'm of the opinion it's better to error there and I think this behavior is rarely if ever wanted.
FFI::Pointer does raise for out-of-bounds integers, and that feels very similar to pack
:
> FFI::MemoryPointer.new(16).write_int(2**32)
(irb):11:in `write_int32': integer 4294967296 too big to convert to `int' (RangeError)
Which seems like the right thing to do, losing data should never be silent IMHO.
Updated by byroot (Jean Boussier) about 2 years ago
Interesting that the setbyte
conclusion was to implicitly modulo.
Whereas Integer#chr
does raise RangeError
:
>> 256.chr
in `chr': 256 out of char range (RangeError)
Updated by Anonymous about 2 years ago
- Status changed from Open to Closed
Applied in changeset git|88040063d0ec8aa64e0de2a3afae7286ec53bfdb.
Array#pack: document silent truncation
Ref: [Feature #19245]
At the very least this behavior should be documented.
Updated by byroot (Jean Boussier) about 2 years ago
- Status changed from Closed to Open
Updated by matz (Yukihiro Matsumoto) almost 2 years ago
I don't think it would be default, but adding strict:
(or any other keyword argument) is OK for me.
Matz.
Updated by mame (Yusuke Endoh) almost 2 years ago
Discussed at the dev meeting. We need to determine some detailed behaviors:
- Should
[-1].pack("C", strict: true)
raise an exception because 'C' represents unsigned char? If what you want is "wrap around", it should raise an error because[-1].pack("C").unpack1("C")
returns 255, not -1. - Should
[255].pack("c", strict: true)
raise an exception? - Should
["foo"].pack("a2", strict: true)
raise an exception? - Should
["f"].pack("a2", strict: true)
raise an exception? - Perhaps we need to consider what is out of range for each format.
And just for the record: @nobu (Nobuyoshi Nakada) proposed a new modifier to check out-of-range value, such as [256].pack("c^") #=> error
. This allows users to determine out-of-range policy per element in one format. However, @matz (Yukihiro Matsumoto) disliked it.