Bug #20185

closed

String#ascii_only? buggy in ruby 3.3

Added by chucke (Tiago Cardoso) almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:116203]

Description

This was the smallest reduction of the bug I could come up with:

require "stringio"

puts StringIO::VERSION

def is_ascii(buffer)
  str = buffer.string
  puts "\"#{str}\" is ascii: #{str.ascii_only?}"
end

buffer = StringIO.new("".b)

buffer.write("a=b&c=d")
buffer.rewind
is_ascii(buffer)
buffer.write "богус"
is_ascii(buffer)

# in ruby 3.3
#=> 3.1.0
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: true

# in ruby 3.2
#=> 3.0.4
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: true

# in ruby 3.1
#=> 3.0.1
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: false

I believe that only the 3.1 result is correct, since the first character of "богус" is not ASCII.
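One way to cross-check the expectation above without relying on String#ascii_only? itself is to look at the bytes directly. This is a minimal sketch; bytes_ascii_only? is a hypothetical helper name, not something provided by Ruby or StringIO:

```ruby
# Hypothetical helper (not part of Ruby or StringIO): decides ASCII-ness
# by scanning every byte instead of calling String#ascii_only?.
def bytes_ascii_only?(str)
  str.each_byte.all? { |byte| byte < 128 }
end

puts bytes_ascii_only?("a=b&c=d")  # prints true
puts bytes_ascii_only?("богус")    # prints false (every byte is above 127)
puts bytes_ascii_only?("богус".b)  # prints false (same bytes, binary encoding)
```

Calling it on buffer.string from the reproduction gives a result that depends only on the bytes held in the buffer.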

Updated by andrykonchin (Andrew Konchin) almost 2 years ago, #1 [ruby-core:116210]

I cannot reproduce the issue with a plain String (without StringIO) on Ruby 3.2, 3.1 and 3.0; ascii_only? reports false for "богус":

ruby -e 'p "богус".ascii_only?'
false

I believe that in the examples involving StringIO, the observed behaviour comes from StringIO#string preserving its encoding. The StringIO instance is initialised with a String literal in binary encoding, and a modification such as a write does not change that encoding, even when a UTF-8 String is written:

io = StringIO.new "".b
io.string.encoding # => #<Encoding:ASCII-8BIT>

io.write "汉"
io.string.encoding # => #<Encoding:ASCII-8BIT>

In the case of the "богус" String literal, there are bytes greater than 127, so they are treated as non-ASCII:

io = StringIO.new "".b
io.write "богус"
io.string.bytes # => [208, 177, 208, 190, 208, 179, 209, 131, 209, 129]
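The two observations above can be combined in one runnable sketch. The helper name binary_buffer_after_write is hypothetical and only serves the illustration; the calls themselves (StringIO.new, StringIO#write, StringIO#string, String#force_encoding) are standard:

```ruby
require "stringio"

# Hypothetical helper: writes text into a StringIO backed by an empty
# binary string and returns the underlying buffer.
def binary_buffer_after_write(text)
  io = StringIO.new("".b)
  io.write(text)
  io.string
end

buffer = binary_buffer_after_write("богус")

# The buffer keeps the binary encoding it was created with.
puts buffer.encoding == Encoding::BINARY

# The UTF-8 bytes were written as-is, so bytes above 127 are present.
puts buffer.bytes.any? { |byte| byte > 127 }

# Reinterpreting a copy of the same bytes as UTF-8 recovers the text.
puts buffer.dup.force_encoding(Encoding::UTF_8) == "богус"
```

All three lines print true: the write stores the raw UTF-8 bytes while the buffer stays tagged as binary.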

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago, #2

  • Status changed from Open to Closed

Updated by chucke (Tiago Cardoso) almost 2 years ago, #3 [ruby-core:116234]

nobu, can I ask why the ticket was closed? Even considering the comment from andrykonchin, he clearly points out at the end that there are bytes greater than 127 in the string (therefore .ascii_only? should be false).

Updated by jeremyevans0 (Jeremy Evans) almost 2 years ago, #4 [ruby-core:116235]

chucke (Tiago Cardoso) wrote in #note-3:

nobu, can I ask why the ticket was closed? Even considering the comment from andrykonchin, he clearly points out at the end that there are bytes greater than 127 in the string (therefore .ascii_only? should be false).

This was fixed by 6283ae8d369bd2f8a022bb69bc5b742c58529dec

Updated by chucke (Tiago Cardoso) almost 2 years ago, #5 [ruby-core:116237]

Apologies everyone, got temporary redmine visual impairment. Thank you.

Updated by Eregon (Benoit Daloze) almost 2 years ago, #6 [ruby-core:116239]

Indeed, on Redmine I see no link to the commit in https://bugs.ruby-lang.org/issues/20185?tab=history#note-2; it seems like a bug.

Updated by hsbt (Hiroshi SHIBATA) almost 2 years ago, #7 [ruby-core:116260]

No, "Fix https://bugs.ruby-lang.org/issues/20185" in the commit message is not the correct format for Redmine autolinking.
