Feature #19193
drop DOS TEXT mode support
Description
On the Windows platform, File.open(path, "r")
returns an object that behaves differently from one opened with "rt" or "rb". I call that DOS TEXT mode here.
DOS TEXT mode does
- crlf conversion
- 0x1a treated as an EOF character on read
and others (see Bug #19192).
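As a rough way to observe the difference, the sketch below writes a few raw bytes and reads them back with each mode string. The sample bytes and the helper name `read_with_mode` are made up for illustration. The result of plain "r" depends on the platform, so no output is claimed for it here.

```ruby
require "tempfile"

# Hypothetical sample: one CRLF line ending followed by a 0x1a byte.
SAMPLE = "a\r\nb\x1Ac".b

# Hypothetical helper: writes SAMPLE as raw bytes, then reads it back
# using the given mode string.
def read_with_mode(mode)
  Tempfile.create("dos_text_demo") do |f|
    f.binmode            # make sure the bytes are written untranslated
    f.write(SAMPLE)
    f.flush
    File.open(f.path, mode) { |io| io.read }
  end
end

binary  = read_with_mode("rb") # raw bytes, no newline conversion
text    = read_with_mode("rt") # CRLF is converted to LF
default = read_with_mode("r")  # platform-dependent; the case discussed here

p binary, text, default
```

Comparing `default` with `binary` on Windows and on a Unix-like system is one way to see what the default mode changes.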
But DOS TEXT mode is almost unnecessary today and it seems to introduce a lot of code complexity.
There is less need for DOS TEXT mode now:
- Most of Microsoft's apps work without CRLF newlines.
- Creating a CRLF text file today should be explicit (but that is the default mode on Windows now).
- Interpreting the EOF character can cause trouble.
I think it's time to consider dropping DOS TEXT mode.
What challenges are there and what preparation is needed?
Updated by mame (Yusuke Endoh) almost 2 years ago
- Related to Bug #18882: File.read cuts off a text file with special characters when reading it on MS Windows added
Updated by nobu (Nobuyoshi Nakada) almost 2 years ago
- Status changed from Open to Assigned
- Assignee set to usa (Usaku NAKAMURA)
YO4 (Yoshinao Muramatsu) wrote:
- Most of Microsoft's apps work without CRLF newlines.
I guess you mean those apps can read without CR, but is it the default to write with LF newlines?
- Creating a CRLF text file today should be explicit (but that is the default mode on Windows now).
This is related to writing, not reading.
I think it's time to consider dropping DOS TEXT mode.
What challenges are there and what preparation is needed?
The most important reason we are keeping "text mode" for reading is backward interoperability with old files.
What do you think, @usa (Usaku NAKAMURA)?
Do many text files use LF newlines on Windows nowadays?
Updated by usa (Usaku NAKAMURA) almost 2 years ago
The most important reason we are keeping "text mode" for reading is backward interoperability with old files.
I agree.
Do many text files use LF newlines on Windows nowadays?
I don't think so.
For example, Excel 2021 outputs CRLF newlines when creating a CSV file.
Reading in text mode and writing in binary mode by default seems to be a reasonable choice.
But is that complexity acceptable to users?
Updated by YO4 (Yoshinao Muramatsu) almost 2 years ago
The most important reason we are keeping "text mode" for reading is backward interoperability with old files.
I agree too.
Interoperability with the unix environment is becoming increasingly important.
So, when writing CRLF newlines, it is better to specify that explicitly.
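One possible shape for "explicit" CRLF output, as a minimal sketch: open the file in binary mode so that no platform newline translation applies, and pass the line terminator yourself. The helper `write_lines` and its `eol:` parameter are hypothetical names, not an existing or proposed Ruby API.

```ruby
require "tempfile"

# Hypothetical helper: writes each line followed by an explicitly chosen
# terminator. "wb" turns off newline translation, so only `eol` decides
# which bytes end a line, on any platform.
def write_lines(path, lines, eol: "\n")
  File.open(path, "wb") do |f|
    lines.each { |line| f.write(line, eol) }
  end
end

lf_bytes, crlf_bytes = Tempfile.create("eol_demo") do |tmp|
  write_lines(tmp.path, %w[a b])              # LF unless asked otherwise
  lf = File.binread(tmp.path)
  write_lines(tmp.path, %w[a b], eol: "\r\n") # CRLF only on request
  [lf, File.binread(tmp.path)]
end

p lf_bytes, crlf_bytes
```

With this style the same script produces the same bytes on Windows and on Unix-like systems, and CRLF appears only where the caller asked for it.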