Feature #3418
closed
IO#putc Clobbers Multi-byte Characters
Added by runpaint (Run Paint Run Run) over 14 years ago.
Updated over 13 years ago.
Description
=begin
IO#putc claims to write a "character", when in fact it writes a byte. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying. Currently, #putc doesn't require the stream to be in binmode, provide any warning of the truncation, or agree with IO#getc on the definition of "character".
open('/tmp/putc', 'w+') {|f| f.putc "\u1234"; f.rewind; f.read}
#=> "\xE1"
open('/tmp/getc', 'w+'){|f| f.print "\u1234"; f.rewind; f.getc}
#=> "ሴ"
If the IO stream explicitly specifies a non-BINARY encoding, the first example fails with an Encoding::UndefinedConversionError, which is reasonable.
open('/tmp/putc', 'w+:UTF-8'){|f| f.putc "\u1234"; f.rewind; f.read}
#=> Encoding::UndefinedConversionError: "\xE1" from ASCII-8BIT to UTF-8
=end
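=begin
A minimal sketch of why the first example leaves "\xE1" in the file and why the third one raises. It uses only String methods and no file I/O, so it does not depend on how a particular Ruby version implements IO#putc. The helper names (hex_bytes, lead_byte, transcode_error) are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers, named here only for illustration.

# Hex value of every byte in the string.
def hex_bytes(str)
  str.bytes.map { |b| format("%02X", b) }
end

# The first byte alone, tagged as binary (ASCII-8BIT).
# This is what a byte-oriented putc would leave in the file.
def lead_byte(str)
  str.byteslice(0, 1).b
end

# Try to transcode binary data to UTF-8.
# Returns the error class on failure, nil on success.
def transcode_error(bytes)
  bytes.encode("UTF-8")
  nil
rescue Encoding::UndefinedConversionError => e
  e.class
end

s = "\u1234"
p s.length                       # one character ...
p hex_bytes(s)                   # ... but three bytes in UTF-8
p lead_byte(s)                   # the lone lead byte
p transcode_error(lead_byte(s))  # the error class from the third example
```

U+1234 encodes as the bytes E1 88 B4 in UTF-8, so writing only the first byte produces data that is neither a complete character nor convertible from ASCII-8BIT to UTF-8.
=end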
Files
=begin
Hi,
In message "Re: [ruby-core:30697] [Bug #3418] IO#putc Clobbers Multi-byte Characters"
on Thu, 10 Jun 2010 05:49:55 +0900, Run Paint Run Run redmine@ruby-lang.org writes:
|IO#putc claims to write a "character", when in fact it writes a byte. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying.
Agreed. The behavior is intentional: the term "character" in the
documentation means a byte, as in 8-bit ASCII, no different from the old
putc(3) function in the C library. So this is a documentation bug
at most.
matz.
=end
=begin
Thanks. Patch attached.
=end
=begin
Drat. Wrong file; try this one.
=end
=begin
Hi,
In message "Re: [ruby-core:30701] [Bug #3418] IO#putc Clobbers Multi-byte Characters"
on Thu, 10 Jun 2010 07:18:58 +0900, Run Paint Run Run redmine@ruby-lang.org writes:
|File io.c-putc.patch added
Thank you for the patch. I will apply it, except for the examples
with multi-byte characters, since I want to leave that behavior an
implementation detail.
matz.
=end
- Status changed from Open to Closed
- % Done changed from 0 to 100
=begin
This issue was solved with changeset r28243.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
=end
- Status changed from Closed to Open
- Assignee set to naruse (Yui NARUSE)
=begin
Hi,
I agree that this is an implementation detail, but I also expect IO#putc
to handle ordinary characters, because IO#getc does:
$ cat t.txt
あいうえお
$ ruby19 -e 'open("t.txt") {|f| p f.getc }'
"あ"
$ ruby19 -e 'open("t.txt", "w") {|f| f.putc ?あ }'
$ ruby19 -e 'open("t.txt") {|f| p f.read }'
"\xE3"
A separate IO#putbyte would be needed for byte-oriented use.
I am moving this ticket to a 1.9.x feature request.
--
Yusuke Endoh mame@tsg.ne.jp
=end
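=begin
The split described in the comment above (whole characters on one side, a separate byte-level call on the other) can be sketched with plain IO#write. This is only an illustration under stated assumptions: putchar, putbyte and written_bytes are hypothetical names, not methods Ruby provides, and the code is not taken from any changeset on this ticket.

```ruby
require "tempfile"

# Hypothetical helper: write one whole character,
# i.e. every byte of the first character of str.
def putchar(io, str)
  io.write(str[0])
end

# Hypothetical helper: write exactly one byte (0..255),
# regardless of any character encoding.
def putbyte(io, byte)
  io.write([byte].pack("C"))
end

# Run the block against a temporary binary-mode file and
# return the bytes that ended up on disk.
def written_bytes
  Tempfile.create("putc") do |f|
    f.binmode
    yield f
    f.flush
    f.rewind
    f.read.bytes
  end
end

text = "\u3042\u3044\u3046\u3048\u304A" # the five hiragana from t.txt

p written_bytes { |f| putchar(f, text) } # all three bytes of the first character
p written_bytes { |f| putbyte(f, 0xE3) } # a single byte, as in the "\xE3" result
```

Keeping the two operations apart makes the truncation explicit: a caller who wants a single byte has to ask for it by number, while passing a string always writes a complete character.
=end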
- Status changed from Open to Assigned
- Status changed from Assigned to Closed
=begin
This issue was solved with changeset r29447.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
=end