Incorrect encoding for ENV in Windows
When an ENV value contains non-ASCII characters, the string read from it does not have the correct encoding.
In Ruby 2.0 we can force it to UTF-8 (regardless of the Windows or console encoding) and it will be correct, but in Ruby 1.9 there is no way to read it correctly.
Writing a non-ASCII string to ENV is not possible at all in either version.
Also, Ruby 1.9 fails to read an ENV variable whose name contains non-ASCII characters.
Here's a test.rb script (basically: set an environment variable outside of Ruby, then print it out in Ruby)
It seems this wasn't properly fixed in #5570
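The Ruby 2.0 workaround mentioned above (forcing the value to UTF-8) could be sketched roughly as follows. This is not the original test.rb; `env_utf8` is a hypothetical helper name, and the sketch assumes the bytes Ruby hands back are already valid UTF-8 and are merely tagged with the wrong encoding, as the report describes for Ruby 2.0. If the bytes have already been replaced with "?" there is nothing left to recover.

```ruby
# Hypothetical helper (not part of Ruby): re-tag an ENV value as UTF-8.
# Assumption: the underlying bytes are UTF-8 but carry the locale encoding.
def env_utf8(name, env = ENV)
  value = env[name]
  return nil if value.nil?

  # dup first, because the string returned by ENV may be frozen.
  forced = value.dup.force_encoding(Encoding::UTF_8)

  # If the bytes are not valid UTF-8, return the value untouched.
  forced.valid_encoding? ? forced : value
end

# Usage with a plain Hash standing in for ENV:
fake_env = { 'unicode' => "task\u16A0\u16C7\u16BB".b }
str = env_utf8('unicode', fake_env)
puts str.encoding
```

Note that `force_encoding` only changes the encoding tag and never transcodes, which is why it helps only when the bytes themselves are intact.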
Updated by usa (Usaku NAKAMURA) over 8 years ago
Since Ruby 1.8 assumes the encoding of ENV is the locale encoding (or the encoding specified by -K),
Ruby 1.9 also treats it as the locale encoding for compatibility.
It was an intentional decision, not a bug.
We could have broken compatibility in Ruby 2.0, but the work was not done.
BTW, to be clear, the present behavior of Ruby 2.0 is wrong.
It should be corrected.
Updated by nobu (Nobuyoshi Nakada) over 6 years ago
- Status changed from Assigned to Closed
Applied in changeset r52896.
hash.c: env encoding fallback on Windows
- hash.c (env_str_new, env_path_str_new): make default string
  UTF-8 for the case conversion is not possible. [Bug #8822]
- hash.c (get_env_cstr): convert non-ASCII string to UTF-8 string.
- hash.c (ruby_setenv): use wide char version to put environment
  variable to deal with non-ASCII value.
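One way to read the first item of the changeset description is "convert to the locale encoding when possible, otherwise keep UTF-8". The real change lives in C in hash.c and is not reproduced here; the following is only a Ruby-level sketch of that reading, and `env_str_fallback` is a hypothetical name.

```ruby
# Hypothetical illustration of an "encoding fallback": try to transcode a
# UTF-8 value to the locale encoding, and keep UTF-8 if that is not possible.
# Assumption: this mirrors the intent of the changeset, not its actual code.
def env_str_fallback(utf8_value, locale = Encoding.default_external)
  utf8_value.encode(locale)
rescue EncodingError
  # Raised (as a subclass) when a character has no mapping in the locale.
  utf8_value
end

ascii_only = env_str_fallback("task", Encoding::IBM437)
with_runes = env_str_fallback("task\u16A0\u16C7\u16BB", Encoding::IBM437)
puts ascii_only.encoding
puts with_runes.encoding
```

Under this reading, a value that fits in the code page keeps the locale encoding, while a value such as the runic example cannot be represented in IBM437 and stays UTF-8.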
Updated by Iristyle (Ethan Brown) almost 6 years ago
- Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN)
I don't believe this is properly fixed.
I just left a comment at https://bugs.ruby-lang.org/issues/9715#note-5, and will leave the same comment here:
The expectation is that, regardless of the current locale / codepage, I should get UTF-8 strings when using ENV on Windows. Here is a simple reproduction of the failure on
C:\Users\Administrator> $env:unicode = 'taskᚠᛇᚻ'
C:\Users\Administrator> dir Env:\unicode

Name                           Value
----                           -----
unicode                        taskᚠᛇᚻ

C:\Users\Administrator> ruby --version
ruby 2.3.0p0 (2015-12-25 revision 53290) [x64-mingw32]
C:\Users\Administrator> chcp
Active code page: 437
C:\Users\Administrator> irb
irb(main):001:0> RUBY_VERSION
=> "2.3.0"
irb(main):002:0> Encoding.default_internal
=> nil
irb(main):003:0> Encoding.default_external
=> #<Encoding:IBM437>
irb(main):004:0> str = ENV['unicode']
=> "task???"
irb(main):005:0> str.encoding
=> #<Encoding:IBM437>
Again, when I access ENV on Windows, I should receive a UTF-8 string with the correct data, not an IBM437 string. The expected string in this case is:
irb(main):036:0> str2 = "task\u16A0\u16C7\u16BB"
=> "task\u16A0\u16C7\u16BB"
irb(main):037:0> str2.encoding
=> #<Encoding:UTF-8>
Note that some browsers, like Chrome on OS X, may fail to render the runic characters correctly, but if you copy them into a proper editor or use another browser you should see the characters fine.
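The expectation stated above can be written down as a small check. This is only a sketch: `utf8_env_value?` is a hypothetical helper, and the "broken" string below is built by hand to imitate the irb output in the reproduction rather than read from a real Windows environment.

```ruby
# Hypothetical check: the value must be tagged UTF-8, be valid UTF-8,
# and match the expected text.
def utf8_env_value?(actual, expected)
  actual.encoding == Encoding::UTF_8 &&
    actual.valid_encoding? &&
    actual == expected
end

expected = "task\u16A0\u16C7\u16BB"

# Hand-built stand-in for what ENV['unicode'] returned in the reproduction:
# the runes are already replaced by "?" and the string is tagged IBM437.
broken = "task???".encode(Encoding::IBM437)

puts utf8_env_value?(broken, expected)
puts utf8_env_value?(expected, expected)
```

On an affected Windows setup, the same check could be run with `ENV['unicode']` in place of `broken`.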