Bug #8342
closed
IO.readlines ignores Encoding.default_internal if Encoding.default_external is ASCII-8BIT
Added by leocassarani (Leo Cassarani) over 11 years ago.
Updated over 11 years ago.
Description
Under normal circumstances, IO.readlines will transcode from Encoding.default_external to Encoding.default_internal:
File.open('hi', 'w') { |f| f.puts "hello\n" }
Encoding.default_external = Encoding::US_ASCII
Encoding.default_internal = Encoding::UTF_8
puts IO.readlines('hi').first.encoding
#=> UTF-8
However, when Encoding.default_external is set to ASCII-8BIT, IO.readlines will always use ASCII-8BIT, regardless of what Encoding.default_internal is set to:
File.open('hi', 'w') { |f| f.puts "hello\n" }
Encoding.default_external = Encoding::ASCII_8BIT
Encoding.default_internal = Encoding::UTF_8
puts IO.readlines('hi').first.encoding
#=> ASCII-8BIT
Using IO#gets instead of IO.readlines will produce the same behaviour.
- Category set to M17N
- Status changed from Open to Assigned
- Assignee set to naruse (Yui NARUSE)
- Target version set to 2.1.0
This seems like intended behavior to me.
- Status changed from Assigned to Rejected
If the external encoding is ASCII-8BIT, the input content is treated as binary.
It is excluded from text encoding conversion, and its encoding is kept as ASCII-8BIT even if default_internal is set.
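When the bytes are known to be text in a particular encoding, one way to get non-binary strings is to relabel them after reading. A minimal sketch, assuming the file really holds UTF-8 data; `readlines_labeled` is a hypothetical helper, not part of Ruby's API:

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby's API): read lines as raw bytes,
# then relabel them with the encoding the caller knows the data is in.
def readlines_labeled(path, encoding = Encoding::UTF_8)
  # 'rb' opens the file in binary mode, so lines come back as ASCII-8BIT
  # and no transcoding takes place.
  File.open(path, 'rb') do |f|
    # force_encoding changes only the label; the bytes are left untouched.
    f.readlines.map { |line| line.force_encoding(encoding) }
  end
end

Tempfile.create('hi') do |tmp|
  tmp.write("hello\n")
  tmp.flush

  labeled = readlines_labeled(tmp.path)
  puts labeled.first.encoding          # UTF-8
  puts labeled.first.valid_encoding?   # true
end
```

Note that `force_encoding` does not validate or convert anything, so `valid_encoding?` is worth checking when the data's origin is uncertain.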
Thanks naruse. However, this seems inconsistent with the way encodings are handled for individual IO instances. For example:
io = File.open('hi', :encoding => "ascii-8bit:utf-16")
puts io.gets.encoding
#=> UTF-16
This happens even if Encoding.default_external is set to ASCII-8BIT before opening the file.
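For contrast, a per-IO `external:internal` pair whose external encoding is a real text encoding does transcode. A minimal sketch, with ISO-8859-1 chosen purely for illustration and `first_line_latin1_as_utf8` being a hypothetical helper:

```ruby
require 'tempfile'

# Hypothetical helper: read the first line of a Latin-1 file,
# transcoded to UTF-8 by the "external:internal" encoding pair.
def first_line_latin1_as_utf8(path)
  File.open(path, encoding: 'iso-8859-1:utf-8') { |f| f.gets }
end

Tempfile.create('latin1') do |tmp|
  tmp.binmode
  tmp.write("caf\xE9\n".b)   # 0xE9 is "é" in ISO-8859-1
  tmp.flush

  line = first_line_latin1_as_utf8(tmp.path)
  puts line.encoding         # UTF-8
  puts line.bytes.length     # 6, since "é" takes two bytes in UTF-8
end
```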
- Status changed from Rejected to Assigned
leocassarani (Leo Cassarani) wrote:
Thanks naruse. However, this seems inconsistent with the way encodings are handled for individual IO instances. For example:
io = File.open('hi', :encoding => "ascii-8bit:utf-16")
puts io.gets.encoding
#=> UTF-16
This happens even if Encoding.default_external is set to ASCII-8BIT before opening the file.
That side sounds buggy.
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r40610.
Leo, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- io.c (rb_io_ext_int_to_encs): ignore internal encoding if external
encoding is ASCII-8BIT. [Bug #8342]
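The effect of this change can be checked with a short script. A minimal sketch, assuming a Ruby version that includes the changeset; `binary_external_line_encoding` is a hypothetical helper, and the internal encoding passed to it is expected to be ignored:

```ruby
require 'tempfile'

# Hypothetical helper: report the encoding of the first line when the file
# is opened with a binary external encoding plus a requested internal one.
def binary_external_line_encoding(path, internal = 'utf-8')
  File.open(path, encoding: "ascii-8bit:#{internal}") { |f| f.gets.encoding }
end

Tempfile.create('hi') do |tmp|
  tmp.write("hello\n")
  tmp.flush

  enc = binary_external_line_encoding(tmp.path)
  # With the internal encoding ignored, the line should stay binary.
  puts enc == Encoding::ASCII_8BIT
end
```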