Bug #19192
Open
IO has a third data mode; the documentation is incomplete.
Description
The documentation on the mode parameter of File.open is incomplete. I would like to clarify the actual behavior of IO's data mode here.
The documentation says:
To specify whether data is to be treated as text or as binary data, either of the following may be suffixed to any of the string read/write modes above:
't': Text data; sets the default external encoding to Encoding::UTF_8; on Windows, enables conversion between EOL and CRLF and enables interpreting 0x1A as an end-of-file marker.
'b': Binary data; sets the default external encoding to Encoding::ASCII_8BIT; on Windows, suppresses conversion between EOL and CRLF and disables interpreting 0x1A as an end-of-file marker.
If neither is given, the stream defaults to text data.
But actually it's more complicated than that.
There are three data modes:
- Text mode: can convert encoding and newlines.
- Binary mode: converts neither encoding nor newlines. The encoding is treated as Encoding::ASCII_8BIT.
- Third mode, DOS TEXT mode: enables conversion between EOL and CRLF and enables interpreting 0x1A as an end-of-file marker.
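A minimal sketch of how binary mode and the no-suffix default can be compared on the 0x1A marker. The helper `read_back` and the file name `sample.dat` are hypothetical, introduced only for illustration. Based on the description above, I would expect "r" to stop at 0x1A on Windows and to return every byte elsewhere, but I have only reasoned this through, not checked it on every platform.

```ruby
require "tmpdir"

# Hypothetical helper (not part of Ruby): write raw bytes to a scratch file,
# then read everything back using the given mode string.
def read_back(bytes, mode)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.dat")
    File.binwrite(path, bytes)
    File.open(path, mode) { |f| f.read }
  end
end

# Seven bytes with the DOS end-of-file marker (0x1A) in the middle.
raw = "abc\x1Adef".b

binary  = read_back(raw, "rb") # binary mode: no conversion at all
default = read_back(raw, "r")  # no suffix: DOS TEXT mode on Windows

puts "rb: #{binary.bytesize} bytes, encoding #{binary.encoding}"
puts "r : #{default.bytesize} bytes"
```

With "rb" all seven bytes come back tagged as ASCII-8BIT on any platform; the byte count printed for "r" is the part that should differ between Windows and other systems.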
On Windows:
- 't': text mode with universal newline conversion.
- 'b': binary mode.
- If neither is given: DOS TEXT mode.
On other platforms:
- 't': text mode with universal newline conversion.
- 'b': binary mode.
- If neither is given: text mode without newline conversion.
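The per-platform lists above can be probed with a small script like the following. This is only a sketch: `read_with_mode` and `sample.txt` are hypothetical names, and the comments describe what the lists above predict rather than output I have verified on Windows.

```ruby
require "tmpdir"

# Hypothetical helper (not part of Ruby): store the bytes untouched,
# then read them back with the requested mode string.
def read_with_mode(bytes, mode)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.txt")
    File.binwrite(path, bytes)
    File.open(path, mode) { |f| f.read }
  end
end

raw = "a\r\nb\r\n".b

text   = read_with_mode(raw, "rt") # universal newline: CRLF becomes LF
binary = read_with_mode(raw, "rb") # bytes returned unchanged
plain  = read_with_mode(raw, "r")  # platform dependent, see the lists above

p text
p binary
p plain
```

On a non-Windows system `plain` should equal `binary` byte for byte, while on Windows DOS TEXT mode should make it look like `text`.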
On Windows, there are some special cases:
- If encoding conversion is specified, DOS TEXT mode is ignored and universal newline conversion is applied.
- If the access mode is "a+" in DOS TEXT mode, the last EOF character (only one) is overwritten.
There are more parameter combinations, see https://gist.github.com/YO4/262e9bd5e44a37a7a2fa9118e271b30b
Is this all? I have not fully investigated.
Since the topic of data mode spans access mode and encoding conversion, I don't think my English skills will allow me to summarize this into rdoc without breaking something...