Bug #18407
Closed
Behavior difference between integer and string flags to File creation
Description
Hi!
I was under the impression that these two commands should either both work or both fail, but they behave differently.
$ ruby -ropen-uri -EUTF-8:UTF-8 -e 'f = File.new("foo", "wb"); f.write URI.open("https://rubygems.org/gems/rake-13.0.6.gem").read'
$ ruby -ropen-uri -EUTF-8:UTF-8 -e 'f = File.new("foo", File::WRONLY | File::TRUNC | File::BINARY); f.write URI.open("https://rubygems.org/gems/rake-13.0.6.gem").read'
-e:1:in `write': "\x8B" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
from -e:1:in `<main>'
This could be an actual bug, or me misunderstanding the documentation. In any case, it seemed worth reporting.
Updated by byroot (Jean Boussier) almost 3 years ago
Reduced test, without open-uri and without changing the default external encoding:
Encoding.default_internal = Encoding::UTF_8
f = File.new("/tmp/test.bin", File::CREAT | File::WRONLY | File::TRUNC | File::BINARY)
f.write "\xC8".force_encoding(Encoding::BINARY)
Updated by deivid (David Rodríguez) almost 3 years ago
Thank you @byroot (Jean Boussier).
Updated by byroot (Jean Boussier) almost 3 years ago
Digging just a little bit:
#ifdef O_BINARY
if (oflags & O_BINARY) {
fmode |= FMODE_BINMODE;
}
#endif
and:
>> File::BINARY
=> 0
In short, File::BINARY is a no-op on Unix systems: it's a Windows-only option, so Ruby defines it as 0 on these OSes, where it basically does nothing.
The problem now is that to make it behave like "b", it would need a value other than 0, which could be a breaking change :/
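For anyone who wants to check this on their own platform, here is a minimal sketch. The constant names BASE_FLAGS and BINARY_FLAGS are made up for illustration; the printed values depend on the OS, so no output is shown.

```ruby
# File::BINARY mirrors the platform's O_BINARY; Ruby defines it as 0
# on platforms that have no O_BINARY (per the comment above).
# BASE_FLAGS and BINARY_FLAGS are illustrative names, not Ruby constants.
BASE_FLAGS   = File::CREAT | File::WRONLY | File::TRUNC
BINARY_FLAGS = BASE_FLAGS | File::BINARY

puts "File::BINARY = #{File::BINARY}"
puts "IO::BINARY   = #{IO::BINARY}"

# When the constant is 0, OR-ing it in leaves the flags unchanged.
puts "flag changes the open mode: #{BASE_FLAGS != BINARY_FLAGS}"
```

If the last line prints false, adding File::BINARY to the flags cannot change anything on that platform, encoding included.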
Updated by nobu (Nobuyoshi Nakada) over 2 years ago
- Description updated (diff)
- Status changed from Open to Closed
IO::BINARY is for O_BINARY, which comes from the underlying runtimes and is unrelated to Ruby encodings.
The second form is for specifying such flags in a fine-grained manner, so, unlike the shorthand "wb", it needs an explicit encoding.
Updated by mame (Yusuke Endoh) over 2 years ago
@deivid (David Rodríguez) This should work:
$ ruby -ropen-uri -EUTF-8:UTF-8 -e 'f = File.new("foo", File::WRONLY | File::TRUNC | File::BINARY, encoding: "BINARY"); f.write URI.open("https://rubygems.org/gems/rake-13.0.6.gem").read'
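The three forms discussed so far can be compared without a network download. This is only a sketch: it assumes both default encodings are UTF-8 (as with -EUTF-8:UTF-8), reuses the single byte from the reduced test, and the helper write_outcome and the constants INT_FLAGS, PAYLOAD and OUTCOMES are hypothetical names. Going by this thread, the flags-only form is the one expected to report a conversion error on Unix-like systems.

```ruby
require "tmpdir"

# Assumption: mirror -EUTF-8:UTF-8 from the original report.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

INT_FLAGS = File::CREAT | File::WRONLY | File::TRUNC | File::BINARY
PAYLOAD   = "\xC8".b # one byte that is not valid UTF-8, tagged as binary

# Hypothetical helper: opens the file with the given mode and options,
# writes the payload, and reports whether the write raised a
# conversion error.
def write_outcome(path, mode, **opts)
  File.open(path, mode, **opts) { |f| f.write(PAYLOAD) }
  :ok
rescue Encoding::UndefinedConversionError
  :conversion_error
end

OUTCOMES = Dir.mktmpdir do |dir|
  path = File.join(dir, "test.bin")
  {
    "wb"                       => write_outcome(path, "wb"),
    "flags"                    => write_outcome(path, INT_FLAGS),
    "flags + encoding: BINARY" => write_outcome(path, INT_FLAGS, encoding: "BINARY"),
  }
end

OUTCOMES.each { |form, result| puts "#{form}: #{result}" }
```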
Updated by deivid (David Rodríguez) over 2 years ago
Thanks @mame (Yusuke Endoh)!
I still think at least the documentation should be updated to mention this, because the current wording makes me think the alternatives I tried should be equivalent and both work: https://ruby-doc.org/core-3.1.2/IO.html#method-c-new.
Updated by mame (Yusuke Endoh) over 2 years ago
deivid (David Rodríguez) wrote in #note-6:
I still think at least the documentation should be updated to mention this
Suggestions for improvement are of course welcome.
Note that the current document says that "b" means "setting the encoding as binary and disabling line code conversion" and File::BINARY means just "disabling line code conversion".
https://docs.ruby-lang.org/en/master/IO.html#class-IO-label-Data+Mode
'b': Binary data; sets the default external encoding to Encoding::ASCII_8BIT; on Windows, suppresses conversion between EOL and CRLF.
https://docs.ruby-lang.org/en/master/File/Constants.html#BINARY
BINARY: disable line code conversion
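The difference in the two doc entries can also be seen by asking the opened file what it thinks its mode is. This is a rough sketch: the helper describe and the constant REPORT are hypothetical names. What the integer-flags form reports depends on the platform and on the default encodings, so only the "wb" result is something I would rely on.

```ruby
require "tmpdir"

# Hypothetical helper: opens the file and collects what Ruby reports
# about its external encoding and binmode state.
def describe(path, mode, **opts)
  File.open(path, mode, **opts) do |f|
    { external_encoding: f.external_encoding, binmode: f.binmode? }
  end
end

REPORT = Dir.mktmpdir do |dir|
  path  = File.join(dir, "test.bin")
  flags = File::CREAT | File::WRONLY | File::TRUNC | File::BINARY
  {
    "wb"        => describe(path, "wb"),
    "int flags" => describe(path, flags),
  }
end

REPORT.each { |form, info| puts "#{form}: #{info.inspect}" }
```

With "wb" the external encoding should come back as the binary encoding and binmode? as true, which matches the 'b' entry quoted above.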
Updated by deivid (David Rodríguez) over 2 years ago
Thanks! The documentation seems much better now (master) than on 3.1, but I will try a PR to clarify a bit more!
Updated by deivid (David Rodríguez) over 2 years ago
I created https://github.com/ruby/ruby/pull/5923.
Updated by nobu (Nobuyoshi Nakada) over 2 years ago
BTW, why do you use File:: instead of IO::?
Because the documents in io.c use the former?
Updated by deivid (David Rodríguez) over 2 years ago
I guess, yeah, and because I was dealing with opening a file, so File:: constants seemed appropriate to set the open mode, right?