Bug #5637: warnings of shellescape - Ruby - Ruby Issue Tracking System

Bug #5637

closed

warnings of shellescape

Added by znz (Kazuhiro NISHIYAMA) over 13 years ago. Updated almost 13 years ago.

Status:

Closed

Assignee:

knu (Akinori MUSHA)

Target version:

2.0.0

ruby -v:

Backport:

[ruby-dev:44878]

Description

\あ

#1 [ruby-dev:44879]

Updated by znz (Kazuhiro NISHIYAMA) over 13 years ago

ruby -v changed from ruby 2.0.0dev (2011-11-15 trunk 33753) [x86_64-linux] to -

This is Kazuhiro Nishiyama.

The text seems to disappear when I write it on Redmine, so I am rewriting it by email.

Shellwords.shellescape emits warnings.

% ./ruby -v -r shellwords -e 'p Shellwords.shellescape("\u3042")'
ruby 2.0.0dev (2011-11-15 trunk 33753) [x86_64-linux]
/home/chkbuild/tmp/build/ruby-trunk/20111114T222552Z/lib/ruby/1.9.1/shellwords.rb:86: warning: regexp match /.../n against to UTF-8 string
/home/chkbuild/tmp/build/ruby-trunk/20111114T222552Z/lib/ruby/1.9.1/shellwords.rb:86: warning: regexp match /.../n against to UTF-8 string
/home/chkbuild/tmp/build/ruby-trunk/20111114T222552Z/lib/ruby/1.9.1/shellwords.rb:86: warning: regexp match /.../n against to UTF-8 string
"\あ"

I also think the escaped result is wrong.
If the escaped result should match 1.8.7, how about the following patch?

diff --git a/lib/shellwords.rb b/lib/shellwords.rb
index 5d6ba75..78331a7 100644
--- a/lib/shellwords.rb
+++ b/lib/shellwords.rb
@@ -79,11 +79,11 @@ module Shellwords
     # An empty argument will be skipped, so return empty quotes.
     return "''" if str.empty?

-    str = str.dup
+    str = str.dup.force_encoding("ASCII-8BIT")

     # Process as a single byte sequence because not all shell
     # implementations are multibyte aware.
-    str.gsub!(/([^A-Za-z0-9_\-.,:\/@\n])/n, "\\\1")
+    str.gsub!(/([^A-Za-z0-9_\-.,:\/@\n])/, "\\\1")

     # A LF cannot be escaped with a backslash because a backslash + LF
     # combo is regarded as line continuation and simply ignored.
diff --git a/test/test_shellwords.rb b/test/test_shellwords.rb
index d48a888..cbc5043 100644
--- a/test/test_shellwords.rb
+++ b/test/test_shellwords.rb
@@ -36,4 +36,8 @@ class TestShellwords < Test::Unit::TestCase
       shellwords(bad_cmd)
     end
   end
+
+  def test_shellescape_utf8_string
+    assert_equal "\\343\\201\\202", shellescape("\u3042")
+  end
 end

--
|ZnZ(ゼットエヌゼット)
|西山和広(Kazuhiro NISHIYAMA)

#2 [ruby-dev:44883]

Updated by knu (Akinori MUSHA) over 13 years ago

Assignee set to knu (Akinori MUSHA)

Updated by knu (Akinori MUSHA) over 13 years ago

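After thinking it over, I have decided to simply drop the //n flag.

- 1.8: Treating every string uniformly as binary was an unavoidable specification, because strings carried no encoding information and $KCODE could not be relied on. (This circumstance does not apply to 1.9+.)
- 1.9: This has been the behavior all the way up to the current 1.9.3. The warning will be removed as a bug (a missed fix of //n), but the behavior will not be changed because that would introduce an incompatibility.
- 2.0: Only the caller knows how the string will be used (for example, the locale of the shell it is passed to), but in 1.9+ the caller can encode the string appropriately, including as ASCII-8BIT. The current behavior, in which shellescape respects that encoding, is therefore exactly what is desirable (albeit by coincidence), and there is no need to change it.

There is also no sensible way to issue a warning (a rule such as "if the string is SJIS, then ..." would be harmful when the caller knows what they are doing, for example by setting the shell's locale to SJIS), so I will not add anything extra and intend only to add a note to the documentation.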
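
The point about shellescape respecting the caller's encoding can be illustrated with a minimal sketch. It assumes a Ruby version in which Shellwords.shellescape matches per character of the string's own encoding, as described above; the variable names are illustrative only.

```ruby
require "shellwords"

# "あ" (U+3042) as a single UTF-8 character: bytes E3 81 82.
utf8 = "\u3042"

# The same three bytes, but tagged as ASCII-8BIT so that each byte
# is treated as its own character.
binary = utf8.dup.force_encoding("ASCII-8BIT")

# UTF-8: one backslash is placed in front of the whole character.
escaped_utf8 = Shellwords.shellescape(utf8)

# ASCII-8BIT: one backslash is placed in front of every byte.
escaped_binary = Shellwords.shellescape(binary)

p escaped_utf8.bytes.map { |b| format("%02X", b) }
p escaped_binary.bytes.map { |b| format("%02X", b) }
```

In this sketch the caller, not shellescape, chooses between the character-wise and the byte-wise result by choosing the encoding of the argument.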

Updated by knu (Akinori MUSHA) over 13 years ago

Status changed from Open to Closed
% Done changed from 0 to 100

This issue was solved with changeset r34166.
Kazuhiro, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

lib/shellwords.rb (Shellwords#shellescape): Drop the //n flag
that only causes warnings with no real effect. [Bug #5637]

#5 [ruby-dev:46033]

Updated by dariocravero (Darío Cravero) almost 13 years ago

Hi,

Thanks for this patch!.. :)

One question though, from comment #3 it's not clear if it's safe to use it in 1.9.3. This is what Google Translator gave me:

"1.9: this behavior was all the way to 1.9.3 now. Turn off warning but does not change as a bug (missing fix of / / n), because the behavior leads to incompatibility."

However, I've applied it and, as expected, I don't see the warning anymore. Still, can you just confirm there're no side effects to this on 1.9.3?

Thanks a million!..

#6 [ruby-dev:46034]

Updated by knu (Akinori MUSHA) almost 13 years ago

As I documented, it's all up to how you use the resulting string.

If you are going to pass it to a shell that lacks support for the encoding of the string, then you should probably encode the original string in ASCII-8BIT before shell-escaping with shellescape() to get a byte-by-byte escape to make sure the shell won't find a metacharacter inside a multibyte character.

UTF-8 multibyte characters do not contain any ASCII character by design anyway, so most people in the everything-is-UTF-8 world don't even have to care about this.

But, for example, when you have to run a program passing a Shift_JIS string via a shell under a non-Shift_JIS locale, you'd probably have to compose the command line in the ASCII-8BIT encoding so that all shell metacharacters that may appear in Shift_JIS multibyte characters are properly escaped.
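
A minimal sketch of the Shift_JIS case, under the assumption that the shell treats the command line as plain bytes: the character "ソ" is encoded in Shift_JIS as the bytes 0x83 0x5C, and its trailing byte is the same as an ASCII backslash. The variable names are illustrative only.

```ruby
require "shellwords"

# Build the Shift_JIS character "ソ" from its raw bytes (0x83 0x5C).
# The trailing byte 0x5C is also the ASCII backslash.
sjis = [0x83, 0x5C].pack("C*").force_encoding("Shift_JIS")

# Escaped as Shift_JIS: the two bytes are one character, so only one
# backslash is prepended and the trailing 0x5C is left unescaped.
escaped_sjis = Shellwords.shellescape(sjis)

# Escaped as ASCII-8BIT: every byte is escaped on its own, so the
# trailing 0x5C gets a backslash in front of it.
escaped_binary = Shellwords.shellescape(sjis.dup.force_encoding("ASCII-8BIT"))

p escaped_sjis.bytes.map { |b| format("%02X", b) }
p escaped_binary.bytes.map { |b| format("%02X", b) }
```

A shell that is not running under a Shift_JIS locale would see the first result end in a bare backslash, which then escapes whatever follows on the command line; the byte-by-byte result avoids that.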
