Feature #8258
closed
Dir#escape_glob
Added by steveklabnik (Steve Klabnik) about 11 years ago. Updated over 2 years ago.
Description
This is inspired by https://github.com/rails/rails/issues/6010.
Basically, if you do a Dir.glob in a directory whose name contains a glob character, things break. It would be nice to have a method which would escape the input so that we can Dir.glob inside of those directories.
Updated by rkh (Konstantin Haase) about 11 years ago
File.fnmatch_escape would make more sense, imo.
Updated by headius (Charles Nutter) about 11 years ago
rkh (Konstantin Haase) wrote:
File.fnmatch_escape would make more sense, imo.
But it would be harder to remember when what you want is "glob" :-)
Why not just {Dir,File}.quote or .escape, to match Regexp.quote/escape? I would vote for File.escape, a method that escapes any file path to make it suitable for globbing.
Updated by steveklabnik (Steve Klabnik) about 11 years ago
I don't feel strongly about the name, specifically.
Updated by Eregon (Benoit Daloze) about 11 years ago
headius (Charles Nutter) wrote:
rkh (Konstantin Haase) wrote:
File.fnmatch_escape would make more sense, imo.
But it would be harder to remember when what you want is "glob" :-)
Why not just {Dir,File}.quote or .escape, to match Regexp.quote/escape? I would vote for File.escape, a method that escapes any file path to make it suitable for globbing.
I agree, this would be strictly superior.
I guess the most common use case is globbing on a directory recursively, so only the base directory is to be escaped, but this is not worth a specific method I think and could be done easily: Dir.glob("#{Dir.escape dir}/**/*.rb") { |file| ... }
Pathname could likely avoid this problem nicely in this situation: dir = Pathname("some_dir"); dir.glob("**/*.rb") { |file| ... }
Updated by Eregon (Benoit Daloze) about 11 years ago
What is more worrying is that implementations differ quite a bit in treating \ as an escape for these glob characters ({, }, [, ], *, ?).
From my tests:
- MRI handles them fine.
- Rubinius does not handle escaped [, { and }.
- JRuby does not handle escaped [ and ].
(For details, see https://travis-ci.org/eregon/path/builds/6326360)
If I am not mistaken, escaping is as simple as: dir.gsub(/\[|\]|\*|\?|\{|\}/, '\\\\' + '\0').
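As a minimal sketch of the one-liner above, the same substitution can be wrapped in a helper. The name escape_glob is hypothetical (no such method exists in Ruby core), and it only covers the six characters listed in this comment; a literal backslash in the input is left untouched.

```ruby
# Hypothetical helper, not part of Ruby: prefix each glob metacharacter
# listed above ({ } [ ] * ?) with a backslash.
def escape_glob(path)
  # The block form of gsub inserts its return value literally, so the
  # replacement needs no extra escaping of the backslash.
  path.gsub(/[\[\]*?{}]/) { |char| "\\#{char}" }
end

escaped = escape_glob("foo[bar]baz.txt")
puts escaped
# With default flags, the escaped pattern matches the literal name.
puts File.fnmatch?(escaped, "foo[bar]baz.txt")
```

The block form is used here only to keep the backslash handling readable; it is intended to produce the same string as the gsub call with '\\\\' + '\0'.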
Updated by rkh (Konstantin Haase) about 11 years ago
- Rubinius does not handle escaped [, { and }.
- JRuby does not handle escaped [ and ].
These are implementation bugs, imo, and nothing to worry about here.
If I am not mistaken, escaping is as simple as: dir.gsub(/\[|\]|\*|\?|\{|\}/, '\\\\' + '\0').
Yes, but it shifts responsibility for keeping this up to date from the user code to the Ruby implementation, and should be flag dependent. I.e. Ruby 2.0 introduced the EXTGLOB flag.
Updated by Eregon (Benoit Daloze) about 11 years ago
rkh (Konstantin Haase) wrote:
- Rubinius does not handle escaped [, { and }.
- JRuby does not handle escaped [ and ].
These are implementation bugs, imo, and nothing to worry about here.
But it means the problem will not be solved in the general case for a while.
It must also have been problematic for some time, so I guess we are not in a hurry either.
If I am not mistaken, escaping is as simple as: dir.gsub(/\[|\]|\*|\?|\{|\}/, '\\\\' + '\0').
Yes, but it shifts responsibility for keeping this up to date from the user code to the Ruby implementation,
I agree there should be Dir.escape or Dir.escape_glob.
and should be flag dependent. I.e. Ruby 2.0 introduced the EXTGLOB flag.
Can you give examples? If it works for every case except FNM_NOESCAPE, I think it is better to have a single simple way.
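To make the flag question concrete, here is a small sketch of how the two flags mentioned in this exchange change what a pattern means in File.fnmatch?. The method name fnmatch_flag_demo and its hash keys are made up for illustration; the flags themselves are the existing File::FNM_NOESCAPE and File::FNM_EXTGLOB constants.

```ruby
# Hypothetical demo method: returns how the same patterns behave
# under different fnmatch flags.
def fnmatch_flag_demo
  {
    # Escaping is on by default: '\*' matches only a literal "*".
    escaped_star_default: File.fnmatch?('\*', "*"),
    # With FNM_NOESCAPE the backslash is an ordinary character, so '\*'
    # means "a backslash followed by anything".
    escaped_star_noescape: File.fnmatch?('\*', "*", File::FNM_NOESCAPE),
    backslash_name_noescape: File.fnmatch?('\*', '\abc', File::FNM_NOESCAPE),
    # Braces are only special when FNM_EXTGLOB is given.
    braces_default: File.fnmatch?("{a,b}", "a"),
    braces_extglob: File.fnmatch?("{a,b}", "a", File::FNM_EXTGLOB)
  }
end

fnmatch_flag_demo.each { |name, matched| puts "#{name}: #{matched}" }
```

So a backslash-based escape only helps when FNM_NOESCAPE is not set, which is the one exception named above.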
Updated by nobu (Nobuyoshi Nakada) about 11 years ago
(13/04/14 18:34), Eregon (Benoit Daloze) wrote:
I guess the most common use case is globbing on a directory recursively, so only the base directory is to be escaped, but this is not worth a specific method I think and could be done easily:
Dir.glob("#{Dir.escape dir}/**/*.rb") { |file| ... }
It reminded me of an old proposal, Dir#glob (not Dir.glob).
--
Nobu Nakada
Updated by Eregon (Benoit Daloze) about 11 years ago
nobu (Nobuyoshi Nakada) wrote:
It reminded me of an old proposal, Dir#glob (not Dir.glob).
Interesting, do you have a link?
Updated by Anonymous almost 10 years ago
An official API for escaping paths would be a hugely useful feature. In Homebrew, we use Dir[], Dir.glob and Pathname.glob a lot, but little attention has been paid to properly escaping paths, and over the years we have accumulated a great deal of potentially problematic code.
Benoit Daloze wrote:
Pathname could likely avoid this problem nicely in this situation: dir = Pathname("some_dir"); dir.glob("**/*.rb") { |file| ... }
We also use Pathname quite heavily in Homebrew and would definitely take advantage of this.
Updated by shyouhei (Shyouhei Urabe) about 6 years ago
- Status changed from Open to Feedback
Issue #13056 introduced the base: option to the Dir.glob method. Is this issue still needed?
Updated by Eregon (Benoit Daloze) about 6 years ago
Looks to me like this can be closed since we have Dir.glob(pattern, base: dir) and Pathname#glob uses it.
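For reference, a rough sketch of what the base: approach looks like for the original problem (a directory whose name contains glob characters). The directory name "app[v2]" and the helper ruby_files_under are invented for the example; it assumes Ruby 2.5 or later, where Dir.glob accepts base: and Pathname#glob exists.

```ruby
require "tmpdir"
require "fileutils"
require "pathname"

# Hypothetical helper: the directory is passed separately from the pattern,
# so glob characters in its name are never parsed as part of the pattern.
def ruby_files_under(dir)
  Dir.glob("**/*.rb", base: dir).sort
end

Dir.mktmpdir do |tmp|
  dir = File.join(tmp, "app[v2]") # made-up directory name with glob characters
  FileUtils.mkdir_p(File.join(dir, "lib"))
  FileUtils.touch(File.join(dir, "lib", "a.rb"))

  p Dir.glob("#{dir}/**/*.rb")    # brackets in dir are read as a character class
  p ruby_files_under(dir)         # paths relative to dir
  p Pathname(dir).glob("**/*.rb") # Pathname objects located under dir
end
```

Note that with base: the results are relative to the base directory, whereas Pathname#glob returns paths joined onto the receiver.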
Updated by mame (Yusuke Endoh) about 6 years ago
Eregon (Benoit Daloze) wrote:
Looks to me like this can be closed since we have Dir.glob(pattern, base: dir) and Pathname#glob uses it.
Consider that we want to enumerate all files that are under a specified directory and whose name is also specified. If the name in question is "foo.txt" for example, we can do it by:
basedir = "/path/to/base/dir/"
filename = "foo.txt"
Dir.glob(basedir + "**/" + filename) # or Dir.glob("**/" + filename, base: basedir)?
However, if filename is "foo[bar]baz.txt", this code does not work. In this case, this feature is still useful.
(I personally prefer File.fnmatch_escape to Dir.escape_glob.)
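A sketch of this case, assuming a user-level escaping helper along the lines discussed earlier in the thread. Both escape_glob and find_by_name are hypothetical names, not existing Ruby methods, and the file names are only test fixtures.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: backslash-escape the glob characters { } [ ] * ?
def escape_glob(name)
  name.gsub(/[\[\]*?{}]/) { |char| "\\#{char}" }
end

# Hypothetical helper: find files with exactly this name anywhere under basedir.
def find_by_name(basedir, filename)
  Dir.glob(File.join("**", escape_glob(filename)), base: basedir).sort
end

Dir.mktmpdir do |basedir|
  FileUtils.mkdir_p(File.join(basedir, "sub"))
  FileUtils.touch(File.join(basedir, "sub", "foo[bar]baz.txt"))
  FileUtils.touch(File.join(basedir, "sub", "foobbaz.txt"))

  # Unescaped, [bar] is a character class and matches the wrong file.
  p Dir.glob("**/foo[bar]baz.txt", base: basedir)
  # Escaped, only the file literally named "foo[bar]baz.txt" is found.
  p find_by_name(basedir, "foo[bar]baz.txt")
end
```

The base: option covers the directory part, but the file name still goes through the pattern parser, which is why escaping remains useful here.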
Updated by Eregon (Benoit Daloze) over 5 years ago
mame (Yusuke Endoh) wrote:
Eregon (Benoit Daloze) wrote:
Looks to me like this can be closed since we have Dir.glob(pattern, base: dir) and Pathname#glob uses it.
Consider that we want to enumerate all files that are under a specified directory and whose name is also specified. If the name in question is "foo.txt" for example, we can do it by:
basedir = "/path/to/base/dir/"
filename = "foo.txt"
Dir.glob(basedir + "**/" + filename) # or Dir.glob("**/" + filename, base: basedir)?
However, if filename is "foo[bar]baz.txt", this code does not work. In this case, this feature is still useful.
Because you'd want to list files whose name is actually "foo[bar]baz.txt"?
I see, makes sense.
My impression is everyone knows "glob'ing" and Dir.glob but very few know the cryptic "fnmatch", so Dir.escape_glob seems easier to find.
Updated by hsbt (Hiroshi SHIBATA) over 2 years ago
- Project changed from 14 to Ruby master