Feature #3418

closed

IO#putc Clobbers Multi-byte Characters

Added by runpaint (Run Paint Run Run) over 14 years ago. Updated over 13 years ago.

Status:
Closed
Target version:
[ruby-core:30697]

Description

=begin
IO#putc claims to write a "character", when in fact it writes a byte. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying. Currently, #putc doesn't require the stream to be in binmode, provide any warning of the truncation, or agree with IO#getc on the definition of "character".

open('/tmp/putc', 'w+') {|f| f.putc "\u1234"; f.rewind; f.read}
#=> "\xE1"

open('/tmp/getc', 'w+'){|f| f.print "\u1234"; f.rewind; f.getc}
#=> "ሴ"

If the IO stream explicitly specifies a non-BINARY encoding, the first example fails with an Encoding::UndefinedConversionError, which is reasonable.

open('/tmp/putc', 'w+:UTF-8'){|f| f.putc "\u1234"; f.rewind; f.read}
#=> Encoding::UndefinedConversionError: "\xE1" from ASCII-8BIT to UTF-8
=end
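
To make the "\xE1" in the first example concrete, here is a minimal sketch that inspects the bytes of the same character. It assumes the string is UTF-8 encoded (which a `\u` escape guarantees); the variable names are illustrative only and it does not call IO#putc itself.

```ruby
# "\u1234" is one character, but three bytes when encoded as UTF-8.
ch = "\u1234"

# All bytes that make up the character.
utf8_bytes = ch.bytes

# Only the leading byte, as a binary string. This matches the "\xE1"
# reported by the first example in the description.
first_byte = ch.byteslice(0, 1).b

puts utf8_bytes.map { |b| format("0x%02X", b) }.join(" ")
puts first_byte.inspect
```

Writing only the leading byte leaves an incomplete UTF-8 sequence in the file, which is why the truncation amounts to data loss rather than a cosmetic difference.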


Files

io.c-putc.patch (1.25 KB), runpaint (Run Paint Run Run), 06/10/2010 07:15 AM
io.c-putc.patch (1.1 KB), runpaint (Run Paint Run Run), 06/10/2010 07:18 AM
Actions #1

Updated by matz (Yukihiro Matsumoto) over 14 years ago

=begin
Hi,

In message "Re: [ruby-core:30697] [Bug #3418] IO#putc Clobbers Multi-byte Characters"
on Thu, 10 Jun 2010 05:49:55 +0900, Run Paint Run Run writes:

|IO#putc claims to write a "character", when in fact it writes a byte. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying.

Agreed. The behavior is intentional: the term "character" in the
documentation means a byte in 8-bit ASCII, so as not to depart from the
old putc(3) function in the C library. So this is a documentation bug
at most.

						matz.

=end

Actions #2

Updated by runpaint (Run Paint Run Run) over 14 years ago

=begin
Thanks. Patch attached.
=end

Actions #3

Updated by runpaint (Run Paint Run Run) over 14 years ago

=begin
Drat. Wrong file; try this one.
=end

Actions #4

Updated by matz (Yukihiro Matsumoto) over 14 years ago

=begin
Hi,

In message "Re: [ruby-core:30701] [Bug #3418] IO#putc Clobbers Multi-byte Characters"
on Thu, 10 Jun 2010 07:18:58 +0900, Run Paint Run Run writes:

|File io.c-putc.patch added

Thank you for the patch. I will apply it, except for the examples
for multi-byte characters, since I want to keep that an implementation
detail.

						matz.

=end

Actions #5

Updated by matz (Yukihiro Matsumoto) over 14 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

=begin
This issue was solved with changeset r28243.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

=end

Actions #6

Updated by mame (Yusuke Endoh) over 14 years ago

  • Status changed from Closed to Open
  • Assignee set to naruse (Yui NARUSE)

=begin
Hi,

I agree that this is an implementation detail, but I also expect IO#putc
to handle ordinary characters, because IO#getc does:

$ cat t.txt
あいうえお

$ ruby19 -e 'open("t.txt") {|f| p f.getc }'
"あ"

$ ruby19 -e 'open("t.txt", "w") {|f| f.putc ?あ }'

$ ruby19 -e 'open("t.txt") {|f| p f.read }'
"\xE3"

A separate IO#putbyte would be needed for byte-oriented use.
I am moving this ticket to a 1.9.x feature request.

--
Yusuke Endoh
=end
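
The split between character-oriented and byte-oriented output can be sketched as follows. This is only an illustration under stated assumptions: `put_byte` is a hypothetical helper standing in for the proposed IO#putbyte (no such method is confirmed by this ticket), and StringIO is used in place of a real file so the example is self-contained.

```ruby
require "stringio"

# Hypothetical stand-in for the proposed IO#putbyte:
# writes exactly one byte (0..255) to the stream and returns it.
def put_byte(io, byte)
  io.write([byte].pack("C"))
  byte
end

# Character-oriented: write the whole character, then read it back with getc.
chars = StringIO.new(String.new(encoding: Encoding::UTF_8))
chars.write("\u3042")   # "あ", three bytes in UTF-8
chars.rewind
round_trip = chars.getc

# Byte-oriented: emit the same character one byte at a time.
raw = StringIO.new(String.new(encoding: Encoding::BINARY))
"\u3042".each_byte { |b| put_byte(raw, b) }
raw_bytes = raw.string.bytes

puts round_trip
puts raw_bytes.inspect
```

Keeping the two paths separate means getc and the character-writing call agree on what a "character" is, while callers who really want single bytes have to ask for them explicitly.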

Actions #7

Updated by shyouhei (Shyouhei Urabe) about 14 years ago

  • Status changed from Open to Assigned


Actions #8

Updated by naruse (Yui NARUSE) about 14 years ago

  • Status changed from Assigned to Closed

=begin
This issue was solved with changeset r29447.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

=end
