Feature #20069
Closed
Buffer class in stdlib
Added by pynix (Pynix wang) 7 months ago. Updated 7 months ago.
Description
Ruby uses String to deal with bytes, and this causes an error in irb: "invalid byte sequence in utf-8".
Can we get a builtin class like Buffer or Bytes that is represented as a hex string?
Updated by nobu (Nobuyoshi Nakada) 7 months ago
What's the use case?
Does it differ from IO::Buffer?
Updated by pynix (Pynix wang) 7 months ago
The main use case is dealing with binary data, as a replacement for String.
E.g. the gRPC bytes type, crypto keys, and more.
It may not be the same as IO::Buffer, so Bytes or Binary would be a good class name.
Updated by pynix (Pynix wang) 7 months ago
irb(main):048> SecureRandom.bytes(10)
=> "\xB6e\x1C\xF3T\x9C\xA1\xDF\xBD\xEA"
irb(main):049> SecureRandom.bytes(10)
=> "\"\xC4;0\xB3\xA6!\x80jn"
irb(main):050> SecureRandom.bytes(10)
=> "\x9B\x9CP\t~\"\xB9\x8EAn"
irb(main):051> SecureRandom.bytes(10)
=> "\xAA\xDEf\x92\x8E\xEE5]\xD0\xB2"
irb(main):052> SecureRandom.bytes(10)
=> "\xFD\xE9\xF5@n\x1D\x9D\xB4\xB7\x8A"
irb(main):053> SecureRandom.bytes(10)
=> "\xCB\x90\xB0\xCB\xDF\xD2\xED\xAA\a\\"
irb(main):054> SecureRandom.bytes(10)
=> "\x16p\x15\x12\xC0\xD6\x02*D\xDB"
irb(main):055> SecureRandom.bytes(10)
=> "F\x97\xC2d\x84\a\x87\xA3P\b"
irb(main):056> SecureRandom.bytes(10)
=> "\x98\xAB\xA1\x96\x15\x91\x92\xF8e5"
irb(main):057> SecureRandom.bytes(10)
=> "\xE3\xD0\xB5P\x95ys\x0E\xCF'"
irb(main):058> SecureRandom.bytes(10)
=> "\xC5\xFB\x04\x97\xFC\xC0\xF5\xEF{\xA2"
Using String as bytes gives a non-uniform representation: some bytes are shown as printable characters and some are not.
Updated by duerst (Martin Dürst) 7 months ago
pynix (Pynix wang) wrote:
Ruby uses String to deal with bytes
Ruby uses big classes. That avoids duplicating a lot of functionality in many classes, and also avoids a lot of conversion operations.
this causes an error in irb: "invalid byte sequence in utf-8"
Can you show an example?
pynix (Pynix wang) wrote in #note-3:
irb(main):058> SecureRandom.bytes(10)
=> "\xC5\xFB\x04\x97\xFC\xC0\xF5\xEF{\xA2"
Using String as bytes gives a non-uniform representation: some bytes are shown as printable characters and some are not.
You can get a uniform representation with SecureRandom.hex(10).
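A minimal sketch of that suggestion, assuming only the standard securerandom library. It compares hex-encoding existing raw bytes with String#unpack1 against asking SecureRandom.hex for a hex string directly. The variable names are illustrative and not from the thread.

```ruby
require "securerandom"

# Raw random bytes come back as a binary (ASCII-8BIT) String.
raw = SecureRandom.bytes(10)

# unpack1("H*") renders every byte as two lowercase hex digits,
# so 10 bytes always become 20 characters.
hex = raw.unpack1("H*")

# SecureRandom.hex(10) draws 10 random bytes and returns them
# already hex-encoded (a different random value than `raw`).
token = SecureRandom.hex(10)

# The hex form is lossless: packing it restores the original bytes.
restored = [hex].pack("H*")
```

The hex form trades compactness for a representation that looks the same for every byte value.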
Updated by austin (Austin Ziegler) 7 months ago
pynix (Pynix wang) wrote:
Ruby uses String to deal with bytes, and this causes an error in irb: "invalid byte sequence in utf-8".
Can we get a builtin class like Buffer or Bytes that is represented as a hex string?
class Bytes < String
  def inspect
    bytes.pack("c*").unpack1("H*")
  end
end
s = Bytes.new(SecureRandom.bytes(10))
What might be more interesting than suggesting an unnecessary class is suggesting a different #inspect when the encoding is ASCII-8BIT or BINARY (because SecureRandom.bytes(10).encoding # => #<Encoding:ASCII-8BIT>, which will eventually be called Encoding:BINARY).
I'm not sure what such an inspect should be, because the inspect that I wrote above for Bytes is both inefficient and incorrect (because the representation is not what #inspect normally shows, so it differs from other strings).
Maybe:
class Bytes < String
  def inspect
    "#<#{encoding}:#{bytes.pack("c*").unpack1("H*")}>"
  end
end
Bytes.new(SecureRandom.bytes(10))
=> #<ASCII-8BIT:1c2dc1463d30c6ed0b9a>
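One possible way to address the inefficiency mentioned above is to hex-encode the string directly rather than going through an intermediate byte array. This is only a sketch: the class name HexBytes is hypothetical and is not proposed anywhere in the thread.

```ruby
# Hypothetical subclass (the name HexBytes is an assumption for illustration).
class HexBytes < String
  # Show the encoding name and the content as lowercase hex.
  # unpack1("H*") works on the string itself, so no Array of
  # integers is built and re-packed first.
  def inspect
    "#<#{encoding}:#{unpack1("H*")}>"
  end
end

# String#b returns a binary (ASCII-8BIT) copy of the literal.
sample = HexBytes.new("\xB6e\x1C".b)
shown = sample.inspect
```

The part after the colon in `shown` is `b6651c`, one pair of hex digits per byte. The prefix is whatever name the running Ruby version reports for the binary encoding.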
Updated by shan (Shannon Skipper) 7 months ago
pynix (Pynix wang) wrote:
Ruby uses String to deal with bytes, and this causes an error in irb: "invalid byte sequence in utf-8"
I'm curious, did you actually run into an "invalid byte sequence" error? If so, could you show the code that produced the error?
I know IO::Buffer has already been mentioned, but just wanted to point out it inspects with pretty hex.
>> IO::Buffer.for SecureRandom.random_bytes
=>
#<IO::Buffer 0x00007f7dfa885998+16 EXTERNAL READONLY SLICE>
#0x00000000 17 bc 59 2d 8b 66 4b 6a 56 96 97 98 5e 07 45 d6 ..Y-.fKjV...^.E.
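For readers who want to try this with fixed input, here is a small sketch that assumes Ruby 3.1 or newer, where IO::Buffer is available (some versions print an "experimental" warning). The sample bytes are the first four from the dump above; the variable names are illustrative.

```ruby
# Assumes Ruby 3.1+; IO::Buffer does not exist in older versions.
data = "\x17\xBC\x59\x2D".b

# Wrap the string's bytes in a buffer.
buffer = IO::Buffer.for(data)

# Size in bytes of the wrapped region.
size = buffer.size

# Read one unsigned 8-bit integer at offset 0.
first_byte = buffer.get_value(:U8, 0)

# Copy the bytes back out as a String and hex-encode them.
copy = buffer.get_string(0, size)
hex = copy.unpack1("H*")
```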
Updated by ioquatix (Samuel Williams) 7 months ago
Ruby uses String to deal with bytes, and this causes an error in irb: "invalid byte sequence in utf-8"
This is desirable behaviour. A String with UTF-8 encoding cannot contain invalid byte sequences. If you want to store binary data, use the Encoding::BINARY encoding.
Can we get a builtin class like Buffer or Bytes that is represented as a hex string?
A binary String already presents its non-printable bytes as hex escapes in its inspect output.
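The difference between the two encodings can be sketched as follows. The three sample bytes are taken from one of the irb outputs earlier in the thread. The regexp match is just one example of an operation that has to interpret the bytes as characters, and the variable names are illustrative.

```ruby
# The same three bytes, once tagged as UTF-8 and once as binary.
utf8   = "\xC5\xFB\x04".b.force_encoding(Encoding::UTF_8)
binary = "\xC5\xFB\x04".b

# 0xC5 starts a two-byte UTF-8 sequence, but 0xFB is not a valid
# continuation byte, so the UTF-8 string is flagged as broken.
utf8_ok   = utf8.valid_encoding?
binary_ok = binary.valid_encoding?

# Matching a regexp against the broken UTF-8 string raises ArgumentError.
error_class =
  begin
    utf8 =~ /x/
    nil
  rescue ArgumentError => e
    e.class
  end

# The binary string works without errors, and its inspect output
# escapes the non-printable bytes as \xNN.
matched = binary =~ /x/
shown   = binary.inspect
```

Here `utf8_ok` is false and `binary_ok` is true, which is why tagging the data as binary avoids the error.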
As others have pointed out, if you want actual memory mapped binary buffers, use IO::Buffer.
Updated by matz (Yukihiro Matsumoto) 7 months ago
- Status changed from Open to Closed
Use either a String with BINARY encoding or IO::Buffer. If these two lack what you want, please open a new issue.
Matz.