Bug #1689 [ruby-core:24029]
ARGF.binmode Affects Encoding Inconsistently
| Status : | Closed | Start : | 06/25/2009 |
| Priority : | Low | Due date : | |
| Assigned to : | - | % Done : | 100% |
| Category : | M17N | | |
| Target version : | - | | |
| ruby -v : | ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux] | | |
Description
The IO#binmode documentation promises that "content is treated as ASCII-8BIT". I assumed this would apply to ARGF, too. It sometimes does:
$ echo "a" | ruby -ve 'ARGF.binmode; p ARGF.readpartial(1).encoding'
ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux]
#<Encoding:ASCII-8BIT>
But often doesn't:
$ echo "a" | ruby -ve 'ARGF.binmode; p ARGF.read.encoding'
ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux]
#<Encoding:UTF-8>
$ echo "a" | ruby -ve 'ARGF.binmode; p ARGF.getc.encoding'
ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux]
#<Encoding:UTF-8>
$ echo "a" | ruby -ve 'ARGF.binmode; p ARGF.readchar.encoding'
ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux]
#<Encoding:UTF-8>
$ ruby -ve 'ARGF.binmode; p ARGF.read.encoding' /usr/bin/ruby
ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux]
#<Encoding:UTF-8>
I had assumed that ARGF.binmode would set the encoding of everything ARGF reads to ASCII-8BIT. This is how File works, for instance: after File#binmode, File#read returns ASCII-8BIT strings, even when the contents are entirely ASCII. Setting the default external encoding to ASCII-8BIT fixes the failing ARGF cases above.
So, given that I'm trying to document ARGF, my questions are:
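For comparison, the File behaviour described here can be checked with a small script. This is only a sketch: the helper name binmode_read_encoding and the temporary file are made up for illustration and are not part of the original report.

```ruby
require "tempfile"

# Hypothetical helper: write a pure-ASCII file, reopen it, switch the
# handle to binary mode, and report the encoding of what IO#read returns.
def binmode_read_encoding
  Tempfile.create("binmode_demo") do |tmp|
    tmp.write("a\n")
    tmp.flush
    File.open(tmp.path, "r") do |io|
      io.binmode          # content should now be treated as ASCII-8BIT
      io.read.encoding    # encoding of the string returned by read
    end
  end
end

p binmode_read_encoding
```

The expectation is that this reports ASCII-8BIT even though the file holds only ASCII bytes, which is the behaviour the report assumes ARGF.binmode should match.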
1) Why the inconsistency between ARGF.readpartial and the rest of the ARGF methods?
2) Should the default external encoding take precedence over 'binmode'?
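To look at question 1 on a particular build, the shell reproductions above can be run in one pass. The sketch below assumes the open3 and rbconfig standard libraries are available; the constant ARGF_READERS and the method argf_encodings are names invented for this example.

```ruby
require "open3"
require "rbconfig"

# The ARGF reader calls exercised in the report (hypothetical constant name).
ARGF_READERS = ["readpartial(1)", "read", "getc", "readchar"].freeze

# For each reader, start a child Ruby with "a\n" on stdin, call
# ARGF.binmode, and capture the printed encoding of the returned string.
def argf_encodings
  ARGF_READERS.to_h do |reader|
    script = "ARGF.binmode; p ARGF.#{reader}.encoding"
    out, _status = Open3.capture2(RbConfig.ruby, "-e", script, stdin_data: "a\n")
    [reader, out.strip]
  end
end

argf_encodings.each { |reader, enc| puts "ARGF.#{reader}: #{enc}" }
```

If binmode is applied consistently, every line should name the same encoding; a mix of encodings is the inconsistency reported here.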
Associated revisions
- io.c (argf_binmode_m): should call rb_io_ascii8bit_binmode() to
set its encoding to ASCII-8BIT.
[ruby-core:24029]