Bug #20185
Closed
String#ascii_only? buggy in ruby 3.3
Description
This was the smallest reduction of the bug I could come up with:
require "stringio"
puts StringIO::VERSION
def is_ascii(buffer)
  str = buffer.string
  puts "\"#{str}\" is ascii: #{str.ascii_only?}"
end
buffer = StringIO.new("".b)
buffer.write("a=b&c=d")
buffer.rewind
is_ascii(buffer)
buffer.write "богус"
is_ascii(buffer)
# in ruby 3.3
#=> 3.1.0
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: true
# in ruby 3.2
#=> 3.0.4
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: true
# in ruby 3.1
#=> 3.0.1
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: false
I believe that only the 3.1 result is correct, as the first character of "богус" is not ASCII.
Updated by andrykonchin (Andrew Konchin) 12 months ago
I cannot reproduce the issue with a plain String (without StringIO) on Ruby 3.2, 3.1 and 3.0. ascii_only? reports false for "богус":
ruby -e 'p "богус".ascii_only?'
false
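To make the plain String check above concrete, here is a minimal sketch that compares a UTF-8 literal with a binary-tagged copy of the same bytes. The variable names are only illustrative, and it assumes the default UTF-8 source encoding:

```ruby
# A UTF-8 literal and a binary (ASCII-8BIT) copy holding the same bytes.
utf8   = "богус"
binary = utf8.b

# Neither is ASCII-only: every byte of "богус" is above 127,
# regardless of which encoding the String is tagged with.
utf8_ascii   = utf8.ascii_only?
binary_ascii = binary.ascii_only?

# A binary String that holds only 7-bit bytes is ASCII-only.
plain_ascii = "a=b&c=d".b.ascii_only?

p utf8_ascii    # false
p binary_ascii  # false
p plain_ascii   # true
```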
I believe that in the examples involving StringIO the observed behaviour is caused by StringIO#string's encoding being preserved. The StringIO instance is initialised with a String literal in binary encoding, and any modification such as writing doesn't change that encoding, even when a UTF-8 String is written:
io = StringIO.new "".b
io.string.encoding # => #<Encoding:ASCII-8BIT>
io.write "汉"
io.string.encoding # => #<Encoding:ASCII-8BIT>
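As a rough sketch of what that preservation means in practice: the written bytes are stored as-is under the binary tag, and a copy can be re-tagged as UTF-8 to get the original character back. The names raw and decoded are only illustrative:

```ruby
require "stringio"

io = StringIO.new("".b)
io.write("汉")

# The underlying String keeps its binary tag; only the bytes were appended.
raw = io.string
same_bytes = (raw.bytes == "汉".bytes)

# Re-tagging a copy as UTF-8 recovers the character without touching the buffer.
decoded = raw.dup.force_encoding(Encoding::UTF_8)

p raw.encoding == Encoding::BINARY  # true
p same_bytes                        # true
p decoded == "汉"                   # true
```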
In the case of the "богус" String literal there are bytes greater than 127, so they are treated as non-ASCII:
io = StringIO.new "".b
io.write "богус"
io.string.bytes # => [208, 177, 208, 190, 208, 179, 209, 131, 209, 129]
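For anyone who needs a check that does not go through String#ascii_only? at all, a minimal sketch is to scan the bytes directly. ascii_bytes_only? is a hypothetical helper, not part of Ruby or StringIO, and the buffer steps mirror the reproduction in the description:

```ruby
require "stringio"

# Hypothetical helper: true only if every byte is in the 7-bit ASCII range.
def ascii_bytes_only?(str)
  str.each_byte.all? { |byte| byte < 128 }
end

buffer = StringIO.new("".b)
buffer.write("a=b&c=d")
buffer.rewind
buffer.write("богус")   # 10 bytes, overwriting the 7 ASCII bytes from position 0

overwritten = buffer.string

p overwritten.bytes                  # [208, 177, 208, 190, 208, 179, 209, 131, 209, 129]
p ascii_bytes_only?(overwritten)     # false
p ascii_bytes_only?("a=b&c=d".b)     # true
```

This walks the whole String on every call, so it is a diagnostic aid rather than a replacement for ascii_only?.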
Updated by nobu (Nobuyoshi Nakada) 11 months ago
- Status changed from Open to Closed
Updated by chucke (Tiago Cardoso) 11 months ago
nobu, can I ask why the ticket was closed? Even considering the comment from andrykonchin, he clearly points out at the end that there are bytes greater than 128 in the string (therefore .ascii_only? should be false).
Updated by jeremyevans0 (Jeremy Evans) 11 months ago
chucke (Tiago Cardoso) wrote in #note-3:
nobu, can I ask why the ticket was closed? Even considering the comment from andrykonchin, he clearly points out at the end that there are bytes greater than 128 in the string (therefore .ascii_only? should be false).
This was fixed by 6283ae8d369bd2f8a022bb69bc5b742c58529dec
Updated by chucke (Tiago Cardoso) 11 months ago
Apologies everyone, got temporary redmine visual impairment. Thank you.
Updated by Eregon (Benoit Daloze) 11 months ago
Indeed, on Redmine I see no link to the commit in https://bugs.ruby-lang.org/issues/20185?tab=history#note-2; it seems like a bug.
Updated by hsbt (Hiroshi SHIBATA) 11 months ago
No, the "Fix https://bugs.ruby-lang.org/issues/20185" in the commit message is not in the correct format for Redmine autolinking.