Bug #5297

Either File.expand_path or File.join is corrupting string encoding

Added by luislavena (Luis Lavena) almost 9 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Target version:
ruby -v: ruby 1.9.4dev (2011-09-07 trunk 33212) [i386-mingw32]
Backport:
[ruby-core:39355]

Description

Hello,

While working on some API improvements for Windows, I found the following issue:

https://gist.github.com/1202366

V:\fóñè>ruby -v
ruby 1.9.4dev (2011-09-07 trunk 33212) [i386-mingw32]

V:\fóñè>chcp 1252
Active code page: 1252

V:\fóñè>ruby -e "puts Encoding.default_external"
Windows-1252

V:\fóñè>irb
irb(main):001:0> a = File.expand_path "."
=> "V:/fóñè"
irb(main):002:0> a.encoding
=> #<Encoding:Windows-1252>
irb(main):003:0> b = Dir.glob("../*").first
=> "../fóñè"
irb(main):004:0> b.encoding
=> #<Encoding:Windows-1252>
irb(main):005:0> File.expand_path b
=> "V:/fóñè"
irb(main):006:0> c = File.expand_path b
=> "V:/fóñè"
irb(main):007:0> c.encoding
=> #<Encoding:Windows-1252>
irb(main):008:0> d = File.join(a, "foo")
=> "V:/f\xF3\xF1\xE8/foo"
irb(main):009:0> d.encoding
=> #<Encoding:ASCII-8BIT> # <= FUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
irb(main):010:0> e = "#{a}/foo"
=> "V:/fóñè/foo"
irb(main):011:0> e.encoding
=> #<Encoding:Windows-1252>
irb(main):012:0> File.open(d, "w+") { |f| f.puts "hi" }
Errno::ENOENT: No such file or directory - V:/fóñè/foo # <= W.T.F.????
from (irb):12:in `initialize'
from (irb):12:in `open'
from (irb):12
from C:/Users/Luis/Tools/Ruby/ruby-head-i386-mingw32/bin/irb:12:in `<main>'
irb(main):013:0> File.open(e, "w+") { |f| f.puts "hi" }
Errno::ENOENT: No such file or directory - V:/fóñè/foo # <= W.T.F. * 20!
from (irb):13:in `initialize'
from (irb):13:in `open'
from (irb):13
from C:/Users/Luis/Tools/Ruby/ruby-head-i386-mingw32/bin/irb:12:in `<main>'
irb(main):014:0>

It is not clear why File.expand_path kept the Windows-1252 encoding while File.join returned an ASCII-8BIT string, even though plain string interpolation of the same path did not lose the encoding.

Even worse, File.open failed with both strings, including the one that is still tagged as Windows-1252.
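To make the File.join symptom easier to reproduce outside of Windows, here is a minimal sketch. It assumes that the bytes returned by File.join are intact and only the encoding tag is lost, which is what the \xF3\xF1\xE8 escapes in the session suggest. The helper name retag is hypothetical, and the ASCII-8BIT string is built by hand to simulate the broken result rather than obtained from File.join itself.

```ruby
# Path bytes as they would look under code page 1252
# (the \u escapes keep this file independent of the source encoding).
path = "V:/f\u00F3\u00F1\u00E8".encode(Encoding::Windows_1252)

# Simulated File.join result from the session: same bytes, tagged ASCII-8BIT.
# This is built by hand; it does not call File.join.
joined = path.b + "/foo".b

# Hypothetical helper: returns a copy of the string with the encoding tag
# replaced. The bytes themselves are not converted.
def retag(str, encoding)
  str.dup.force_encoding(encoding)
end

repaired = retag(joined, path.encoding)

puts joined.encoding == Encoding::BINARY   # tag was lost
puts repaired.encoding == path.encoding    # tag restored
puts repaired.valid_encoding?
puts repaired.encode(Encoding::UTF_8)
```

This only addresses the lost tag. It does not explain the File.open failure, which also happens with the string that is correctly tagged as Windows-1252.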

I'm working on a replacement function for expand_path that relies on MultiByteToWideChar + GetFullPathNameW + WideCharToMultiByte and then uses rb_filesystem_str_new_cstr to return the string.
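The conversion steps can be illustrated with a portable analogue. This is a rough sketch and not the actual replacement code: String#encode stands in for MultiByteToWideChar and WideCharToMultiByte, and a hypothetical pure-Ruby helper (resolve_segments) stands in for GetFullPathNameW. The names expand_path_sketch and resolve_segments, as well as passing the base directory explicitly, are assumptions made for illustration.

```ruby
# Hypothetical stand-in for GetFullPathNameW: resolves "." and ".." segments
# of `relative` against `base`. Both arguments are expected to be UTF-8.
def resolve_segments(base, relative)
  parts = base.split("/")
  relative.split("/").each do |segment|
    case segment
    when "", "." then next
    when ".." then parts.pop if parts.size > 1 # never drop the drive part
    else parts << segment
    end
  end
  parts.join("/")
end

# Sketch of the round trip: code page -> wide chars -> resolve -> code page.
def expand_path_sketch(relative, base, fs_encoding)
  # Analogue of MultiByteToWideChar: code page bytes to UTF-16LE.
  wide_relative = relative.encode(Encoding::UTF_16LE)
  wide_base     = base.encode(Encoding::UTF_16LE)

  # Resolution is done on UTF-8 copies only to keep the helper simple.
  full = resolve_segments(wide_base.encode(Encoding::UTF_8),
                          wide_relative.encode(Encoding::UTF_8))

  # Analogue of WideCharToMultiByte: the result is transcoded to, and
  # tagged with, the filesystem encoding.
  full.encode(fs_encoding)
end

base   = "V:/f\u00F3\u00F1\u00E8".encode(Encoding::Windows_1252)
result = expand_path_sketch(".", base, Encoding::Windows_1252)

puts result.encoding == Encoding::Windows_1252
puts result == base
```

The point of the sketch is that every string leaving the function has been transcoded to, and tagged with, the filesystem encoding, so later concatenation has a consistent encoding to work with.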

The funny thing is that the replacement works properly:

C:\Users\Luis\Projects\oss\me\fenix>ripl -Ilib

require "fenix"
=> true
Dir.chdir "V:"
=> 0
Dir.pwd
=> "V:/fóñè"
c = Fenix::File.expand_path "."
=> "V:/fóñè"
c.encoding
=> #<Encoding:Windows-1252>
File.join(c, "foo").encoding
=> #<Encoding:Windows-1252>
d = "#{c}/foo"
=> "V:/fóñè/foo"
d.encoding
=> #<Encoding:Windows-1252>
File.open(d, "w") { |f| f.puts "hi" }
=> nil
