On Fri, 2017-11-03 at 18:34 +0000, shevegen@gmail.com wrote:
> Issue #14077 has been updated by shevegen (Robert A. Heiler).
>
> I am in agreement with the feature-suggestion. Not sure whether
> it should be a constant or a method or both but I agree that it
> may be useful to have direct support for this in ruby.
> ...
> Matz said several times that one (core?) part of ruby's philosophy
> is the "human aspect" aka how something is used with ruby. I think
> that this is also a reason why the ruby core team often likes
> to see "real world use cases" to determine how/if something is
> used.

Many of my cheesy Ruby scripts manipulate directory hierarchies on both
Windows and Linux, often to fix problems that occur when you share an
NTFS-formatted external disk drive between systems.
This is one of the most frequent things that I have to do, since many
of my files (and some directories) use Korean UTF-8 characters:
  Dir.entries('/.../mydir/', :encoding=>'UTF-8').each do |base|
I know that I must specify the encoding on Windows 7, or else it
assumes Windows-1252 and messes up multi-byte characters. This code
also works fine on Ubuntu 14, Fedora 24 and Debian 9, although I don't
even know what the default or filesystem encoding is on Linux systems.
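
For anyone who wants to try the same call without touching real data,
here is a minimal sketch. The helper name utf8_entries, the temporary
directory and the sample filename are my own illustration, not part of
the proposal; the only thing taken from above is the encoding option of
Dir.entries.

```ruby
require 'tmpdir'

# Hypothetical helper: list a directory's entries as UTF-8 strings,
# skipping the "." and ".." pseudo-entries.
def utf8_entries(dir)
  Dir.entries(dir, encoding: 'UTF-8').reject { |base| base == '.' || base == '..' }
end

# Try it on a throwaway directory containing one Korean filename.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, '한글.txt'), '')
  utf8_entries(dir).each do |base|
    puts "#{base} (#{base.encoding})"
  end
end
```

The hash-rocket form :encoding=>'UTF-8' and the keyword form
encoding: 'UTF-8' are the same option.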
FYI, on Windows 7, I work exclusively with NTFS and FAT32. On Linux, I
routinely work with ext4, NTFS and FAT32. Are you aware that the NTFS
driver for Linux allows you to create filesystem objects with names
that are unworkable under Windows (names with embedded colons, for
instance)?
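
A rough sketch of how a script could flag such names before the disk
goes back to a Windows machine. The constant and method names are
hypothetical, and the check only covers the characters Windows
reserves in filenames; it does not look at reserved device names or
trailing dots and spaces.

```ruby
# Characters that Windows does not allow in a file or directory name:
# < > : " / \ | ? * and the control characters 0x00-0x1F.
WINDOWS_RESERVED_CHARS = %r{[<>:"/\\|?*\x00-\x1F]}

# Hypothetical helper: true if the base name contains a character that
# Windows would reject. Assumes the name is validly encoded.
def windows_unsafe?(base)
  !!(base =~ WINDOWS_RESERVED_CHARS)
end

%w[notes.txt 12:30-meeting.txt].each do |base|
  puts "#{base}: #{windows_unsafe?(base) ? 'unsafe on Windows' : 'ok'}"
end
```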
I had to go to the Internet to figure out that I needed to use
:encoding=>'UTF-8' to properly handle multi-byte characters on Windows
7. It would have been nice to have Ruby tell me what the default
encodings were. That's a lame reason for inclusion of this proposed
feature, but it's all I have at the moment.
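
For what it's worth, Ruby can already report these values, just not in
one obvious place. A small sketch; the method name encoding_report is
made up for illustration, while Encoding.default_external,
Encoding.default_internal and the special names 'filesystem' and
'locale' accepted by Encoding.find are existing API.

```ruby
# Hypothetical helper: collect the encodings Ruby is currently using.
def encoding_report
  {
    'default_external' => Encoding.default_external,
    'default_internal' => Encoding.default_internal, # nil unless set
    'filesystem'       => Encoding.find('filesystem'),
    'locale'           => Encoding.find('locale')
  }
end

encoding_report.each { |name, enc| puts "#{name}: #{enc.inspect}" }
```

The values depend on the platform and locale, so the output will
differ between my Windows and Linux machines.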
In the past, I ran into another problem: text embedded in a character
encoding different from that of the enclosing text. I still find that
today, in filenames and text that mix English, Japanese and Korean
into a single string or file. I blame word processors for this
mess. I used to jump through hoops to handle the problem, then I got
smart and just forced the encoding to UTF-8, replacing bad characters
with ''. In this situation, I don't see how knowing the filesystem or
default encodings would help, since the person who created the
Frankenstein-text didn't realize what they were doing.
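
The "force it to UTF-8 and drop the bad characters" approach can be
sketched roughly like this. The method name force_utf8 and the sample
string are illustrative only, and String#scrub needs Ruby 2.1 or
later; my own scripts may not have done it exactly this way.

```ruby
# Hypothetical helper: treat the bytes as UTF-8 and delete every byte
# sequence that is not valid UTF-8. Works on a copy of the input.
def force_utf8(text)
  text.dup.force_encoding(Encoding::UTF_8).scrub('')
end

# Valid Korean text with one stray byte (0xFF is never valid in UTF-8).
mixed = "한글\xFF.txt"
clean = force_utf8(mixed)
puts "#{clean} valid=#{clean.valid_encoding?}"
```

This throws data away: any text that really was in another encoding is
deleted rather than converted.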