ENV encoding not UTF-8 by default
$ irb
2.5.1 :001 > 'secret'.encoding
 => #<Encoding:UTF-8>
2.5.1 :002 > ENV['PASS'] = 'secret'; ENV['PASS'].encoding
 => #<Encoding:US-ASCII>
2.5.1 :009 > ENV['PASS'] = 'Ł'
 => "\u0141"
2.5.1 :010 > ENV['PASS'].encoding
 => #<Encoding:ASCII-8BIT>
I would expect all encodings to be UTF-8 at all times.
Updated by shevegen (Robert A. Heiler) almost 2 years ago
If I put this into a .rb file:
puts 'secret'.encoding
ENV['PASS'] = 'secret'
puts ENV['PASS'].encoding
On my system I get these two Strings output:
My environment, i.e. my current locale, is ISO-8859-1, so the results that
I get seem correct. I can change the default UTF-8 encoding if I use a
shebang line (an encoding magic comment) in the .rb file, which I normally
do, so all my encodings are the same (ISO-8859-1; regexes used to behave a
bit oddly sometimes, but I am not sure whether that has changed).
I think ENV behaves a little bit differently upon an
If I use a shebang line in a .rb file that includes the above Unicode
character (this weird L), then all string encodings in that .rb file
are also ISO-8859-1, so I am not sure if there is any bug at all.
It may be more related to IRB perhaps? I skipped testing on IRB mostly
because .rb files have a "higher weight" than code put through IRB.
The documentation does not mention what happens with encodings when
these are assigned to an ENV key, though:
Perhaps it has more to do with IRB, in which case it could be added
And of course it may be that there is indeed a bug. You can try to
test with a standalone .rb file though and, if necessary, with a
specific shebang comment.
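A minimal sketch of such a standalone test file follows. The method name encoding_report is hypothetical and used only for illustration; the script simply prints the encodings discussed above side by side. To pin the source encoding, an encoding magic comment would have to be added as the very first line of the file (it is left out here, so the source encoding is whatever Ruby defaults to).

```ruby
# Collect the encodings discussed above so they can be compared in one run.
# encoding_report is a hypothetical helper, not part of Ruby.
def encoding_report
  ENV['PASS'] = 'secret'
  {
    source:  __ENCODING__,            # encoding of this source file
    literal: 'secret'.encoding,       # encoding of a plain string literal
    locale:  Encoding.find('locale'), # encoding derived from the locale
    env:     ENV['PASS'].encoding     # encoding of the value read back from ENV
  }
end

encoding_report.each { |name, enc| puts "#{name}: #{enc}" }
```

Running the file under different locale settings should make it easier to see whether the ENV encoding follows the locale rather than the source encoding.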
Updated by naruse (Yui NARUSE) 7 months ago
The value assigned to ENV is stored in the process's environment variables.
The encoding of ENV[key] is set to the locale encoding.
You can get the locale encoding with Encoding.find("locale"), which is decided based on Encoding.locale_charmap, which is affected by
ENV["PATH"] is returned in the filesystem encoding, but that is the same as the locale encoding on Unix.
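As a rough illustration of the explanation above, the following sketch prints the locale charmap, the locale encoding, and the encoding of a value read back from ENV. The helper env_utf8 is hypothetical (it is not part of Ruby) and shows one possible workaround for callers who want UTF-8 strings; it assumes the bytes stored in the environment really are UTF-8.

```ruby
# Hypothetical helper (not part of Ruby): fetch an ENV value and re-tag it
# as UTF-8. Assumes the bytes stored in the environment really are UTF-8.
def env_utf8(key)
  value = ENV[key]
  return nil if value.nil?

  # dup first, since the string returned by ENV may be frozen;
  # force_encoding only changes the encoding tag, it does not transcode.
  value.dup.force_encoding(Encoding::UTF_8)
end

ENV['PASS'] = "\u0141" # the character from the original report

puts "locale charmap:  #{Encoding.locale_charmap}"
puts "locale encoding: #{Encoding.find('locale')}"
puts "ENV encoding:    #{ENV['PASS'].encoding}"
puts "re-tagged:       #{env_utf8('PASS').encoding}"
```

The first three lines of output depend on the locale the process was started with; only the last one is fixed by the helper. If the environment holds bytes in some other encoding, transcoding with String#encode would be the safer choice than re-tagging.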