Bug #15993

'require' doesn't work if there are Cyrillic chars in the path to Ruby dir

Added by inversion (Yura Babak) 3 months ago. Updated 2 months ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.6.3p62 (2019-04-16 revision 67580) [x64-mingw32]
[ruby-core:93655]

Description

I'm trying to build a cross-platform portable application that bundles Ruby, and there is a problem on Windows.
A user usually installs it to the Roaming folder, which sits inside the user folder, and that folder's name is often non-Latin or contains spaces.
When there is a Cyrillic character (possibly any non-Latin character) in the path, require of any gem fails:

D:\users\киї\Ruby\2.6\bin>ruby -v
ruby 2.6.3p62 (2019-04-16 revision 67580) [x64-mingw32]

D:\users\киї\Ruby\2.6\bin>ruby -e "require 'logger'"
Traceback (most recent call last):
        1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)

D:\users\киї\Ruby\2.6\bin>ruby --disable=rubyopt -e "require 'logger'"
Traceback (most recent call last):
        1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)

D:\users\киї\Ruby\2.6\bin>gem list
Traceback (most recent call last):
        1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)

The path in the error output shows the UTF-8 bytes of the directory name being displayed as Windows-1251:

киї (utf-8) == РєРёС— (win1251)
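
The mapping above can be checked in Ruby itself. The following is a minimal sketch, assuming the script is saved as UTF-8 (Ruby's default source encoding); the variable names are illustrative only. It takes the bytes of the directory name, relabels them as Windows-1251 without changing them, and transcodes the result back to UTF-8 for display.

```ruby
# The directory name from the report, as a UTF-8 string literal.
original = 'киї'

# Keep the bytes unchanged but label them Windows-1251,
# then transcode to UTF-8 so the result can be printed.
garbled = original.dup.force_encoding('Windows-1251').encode('UTF-8')

# Show the raw UTF-8 bytes, then the misread string.
puts original.bytes.map { |b| format('%02X', b) }.join(' ')
puts garbled
```

The second line printed should be the same garbled directory name that appears in the LoadError messages, which suggests that the path bytes are correct and only their encoding label is wrong.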

I have an old Ruby installation that works fine:

D:\users\киї\Ruby\2.0\bin>ruby -e "require 'logger'"

D:\users\киї\Ruby\2.0\bin>ruby -v
ruby 2.0.0p451 (2014-02-24) [i386-mingw32]

The same holds for ruby 2.0.0p643 (2015-02-25) [i386-mingw32].

I also checked that require fails in the same case for
ruby 2.1.9p490 (2016-03-30 revision 54437) [i386-mingw32]

History

Updated by inversion (Yura Babak) 3 months ago

It looks like there is an ugly workaround.

1) Run chcp 1251 in the current console session.
2) Run Ruby with the option --disable=gems so that it does not fail at startup.
3) Add the following code at the very beginning of the script:

if $:[0].encoding.name == 'Windows-1251'
    $:.each {|path| path.encode! 'UTF-8' }
    $:.push '.'    # somehow it helps, looks like a modification of array is needed
    require 'rubygems'
end

This helped me to overcome the problem and run my script from a folder with Cyrillic characters and spaces in the path.

But it definitely should be fixed.

Updated by duerst (Martin Dürst) 3 months ago

ko1 (Koichi Sasada): I can check whether this bug is reproducible. But I'm not too familiar with how Ruby deals with the Windows file system. So I'm not confident I will be able to find and fix this bug.

Updated by MSP-Greg (Greg L) 2 months ago

On a US Windows system, I used a base Ruby folder of C:\Greg\Ruby киї (using a space and Cyrillic characters), and I could reproduce the issue.

Without any console chcp command, I did the following, which also solved the issue:

# start ruby with --disable=gems
$:.map! { |path| path.dup.force_encoding 'UTF-8' }
require 'rubygems'

require 'openssl'
puts OpenSSL::VERSION

I don't think spaces in Windows paths are an issue anymore, but I haven't rigorously checked...
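
A slightly more defensive variant of the force_encoding workaround above might look like the sketch below. It is only an illustration: retag_as_utf8 is a hypothetical helper, not part of Ruby or RubyGems, and it assumes the path bytes really are UTF-8. Entries whose bytes are not valid UTF-8 are returned unchanged rather than mislabeled.

```ruby
# Hypothetical helper: relabel each path as UTF-8, but only when its
# bytes form valid UTF-8. The input strings are not modified.
def retag_as_utf8(paths)
  paths.map do |path|
    candidate = path.dup.force_encoding(Encoding::UTF_8)
    candidate.valid_encoding? ? candidate : path
  end
end

# Sample data: one binary-tagged path with UTF-8 bytes, one with a byte
# that can never occur in UTF-8.
sample = ['C:/Greg/Ruby киї/lib/ruby/2.7.0'.b, "C:/bad/\xFF".b]
fixed = retag_as_utf8(sample)
fixed.each { |path| puts "#{path.encoding.name}\t#{path.valid_encoding?}" }
```

In a real script started with --disable=gems, the equivalent call would be $LOAD_PATH.replace(retag_as_utf8($LOAD_PATH)) followed by require 'rubygems'; that combination has not been tested here.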

Updated by MSP-Greg (Greg L) 2 months ago

While taking a break, I looked at this again. Below is the encoding of various items:

$LOAD_PATH
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/site_ruby/2.7.0
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/site_ruby/2.7.0/x64-msvcrt
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/site_ruby
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/vendor_ruby/2.7.0
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/vendor_ruby/2.7.0/x64-msvcrt
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/vendor_ruby
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/2.7.0
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/2.7.0/x64-mingw32

IBM437          __FILE__
IBM437          __dir__
UTF-8           Dir.pwd

The encoding wasn't affected by using -E in RUBYOPT.

Tested using today's trunk.
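
A listing like the one above can be produced with a few lines of Ruby. This is a sketch of one way to do it, not necessarily the script that was used; encoding_line is a hypothetical helper name. The nil check is there because __dir__ returns nil in some contexts, such as code run through eval.

```ruby
# Hypothetical helper: build one report line with the encoding name
# padded to a fixed width, followed by a label.
def encoding_line(label, value)
  name = value.nil? ? '(nil)' : value.encoding.name
  format('%-15s %s', name, label)
end

# Each load path entry is printed with its own encoding label.
$LOAD_PATH.each { |path| puts encoding_line(path.to_s, path.to_s) }

puts encoding_line('__FILE__', __FILE__)
puts encoding_line('__dir__', __dir__)
puts encoding_line('Dir.pwd', Dir.pwd)
```

Running it once from the affected folder and once from an ASCII-only folder would show which of these values pick up a different encoding label.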
