Bug #16970
Closed
Encoding of ENV value returns ASCII-8BIT in Ruby 2.6 or later
Description
Problem Report
When the internal encoding is set to UTF-8, the encoding of an ENV value is always ASCII-8BIT (UTF-8 is expected).
Steps to reproduce and results
Ruby 2.5
set TEST=日本
ruby --encoding=UTF-8:UTF-8 -e "p 'test'.encoding"       #=> #<Encoding:UTF-8>
ruby --encoding=UTF-8:UTF-8 -e "p ENV['TEST'].encoding"  #=> #<Encoding:UTF-8>
Ruby 2.6
set TEST=日本
ruby --encoding=UTF-8:UTF-8 -e "p 'test'.encoding"       #=> #<Encoding:UTF-8>
ruby --encoding=UTF-8:UTF-8 -e "p ENV['TEST'].encoding"  #=> #<Encoding:ASCII-8BIT>    **INVALID RESULT** (expected UTF-8)
ENV['TEST'] =~ /日本/  # non-ASCII regexp => incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
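The regexp failure above can be reproduced on any platform by tagging the UTF-8 bytes of the value as ASCII-8BIT, which is what the affected versions return from ENV. The sketch below also shows one possible workaround on those versions: retagging the bytes as UTF-8 without transcoding. The helper `retag_as_utf8` is hypothetical (it is not part of Ruby), and it assumes the bytes in the environment really are UTF-8.

```ruby
# Stand-in for the value ENV['TEST'] returns on the affected versions:
# the UTF-8 bytes of "日本", tagged as ASCII-8BIT (String#b returns such a copy).
value = "日本".b

# Hypothetical helper (not part of Ruby): retag binary bytes as UTF-8 without
# transcoding, but only when they form valid UTF-8. Otherwise return the input.
def retag_as_utf8(str)
  return str unless str.encoding == Encoding::ASCII_8BIT
  candidate = str.dup.force_encoding(Encoding::UTF_8)
  candidate.valid_encoding? ? candidate : str
end

# Matching a UTF-8 regexp against the binary-tagged string raises.
begin
  value =~ /日本/
  mismatch = nil
rescue Encoding::CompatibilityError => e
  mismatch = e
end

# After retagging, the same match succeeds.
utf8_value = retag_as_utf8(value)
match_index = utf8_value =~ /日本/
```

Retagging with `force_encoding` changes only the encoding label, so it is safe only when the underlying bytes are already UTF-8. Where the environment uses another code page, a real conversion with `String#encode` would be needed instead.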
The regression was introduced by the following commit
fallback env encoding to ASCII-8BIT
https://github.com/ruby/ruby/commit/7f0d337be73bb2465b40009fe23f3b7be6b0dc90
Cause of the bug
rb_str_cat_conv_enc_opts returns Qnil when the from and to encodings are the same (UTF-8), and the fallback code
introduced by the following commit then sets the encoding to ASCII-8BIT.
https://github.com/ruby/ruby/commit/7f0d337be73bb2465b40009fe23f3b7be6b0dc90
Fix
When the internal encoding is UTF-8, return the string as is, since no encoding conversion is needed.
https://github.com/ruby/ruby/pull/3239
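The cause and the fix can be modelled in plain Ruby. This is only a sketch of the behavior described in this report, not the C code in the interpreter or in the pull request. All method names are hypothetical. `convert_or_nil` stands in for rb_str_cat_conv_enc_opts and assumes, as the report states, that it yields nil when the source and target encodings are identical.

```ruby
# Hypothetical stand-in for rb_str_cat_conv_enc_opts: returns nil when no
# conversion is needed (from == to) or when the conversion fails.
def convert_or_nil(bytes, from, to)
  return nil if from == to
  bytes.dup.force_encoding(from).encode(to)
rescue EncodingError
  nil
end

# Model of the reported 2.6 behavior: a nil result is treated as a failed
# conversion, so the value falls back to ASCII-8BIT even when from == to.
def env_value_with_fallback(bytes, from, to)
  convert_or_nil(bytes, from, to) || bytes.b
end

# Model of the proposed fix: when both encodings are the same, skip the
# conversion and return the string tagged with that encoding.
def env_value_fixed(bytes, from, to)
  return bytes.dup.force_encoding(to) if from == to
  convert_or_nil(bytes, from, to) || bytes.b
end

raw = "日本".b  # raw UTF-8 bytes, as they might arrive from the environment

buggy    = env_value_with_fallback(raw, Encoding::UTF_8, Encoding::UTF_8)
fixed    = env_value_fixed(raw, Encoding::UTF_8, Encoding::UTF_8)
converted = env_value_fixed(raw, Encoding::UTF_8, Encoding::Windows_31J)

p buggy.encoding
p fixed.encoding
p converted.encoding
```

In this model the early return affects only the case where the two encodings are equal. Real conversions, and the ASCII-8BIT fallback for conversions that fail, behave the same way in both versions.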
        
Updated by masuyama (Soichi Masuyama) over 5 years ago
- Description updated
Updated by masuyama (Soichi Masuyama) over 5 years ago
- Description updated
Updated by masuyama (Soichi Masuyama) over 5 years ago
- Description updated
Updated by jeremyevans0 (Jeremy Evans) about 5 years ago
- Related to Bug #16623: Windows ENV encoding added
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
- Status changed from Open to Closed
Ruby 3.0 uses UTF-8 for ENV values on Windows by default, even if the code page is not UTF-8. So I think this and #16623 can be closed.