https://redmine.ruby-lang.org/https://redmine.ruby-lang.org/favicon.ico?17113305112016-01-20T02:25:56ZRuby Issue Tracking SystemRuby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=561792016-01-20T02:25:56Znaruse (Yui NARUSE)naruse@airemix.jp
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/56179/diff?detail_id=40063">diff</a>)</li></ul> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=561812016-01-20T02:45:05Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:naruse@airemix.jp" class="email">naruse@airemix.jp</a> wrote:</p>
<blockquote>
<p><code>Dir#each</code> and <code>Dir#read</code> (including <code>Dir.entries</code>, <code>Dir.foreach</code> and other methods) return <code>"."</code> and <code>".."</code> first.<br>
But in all real use cases <code>"."</code> and <code>".."</code> are useless.<br>
How about excluding them?</p>
</blockquote>
<p>If Ruby were a new language, yes. But I think it is too risky, now.</p>
<blockquote>
<pre><code class="diff syntaxhl" data-language="diff"><span class="gi">+#define DIR_IS_DOT_OR_DOTDOT(dp) ((dp)->d_name[0] == '.' && \
+ ((dp)->d_name[1] == '\0' || ((dp)->d_name[1] == '.' && (dp)->d_name[2] == '\0')))
</span></code></pre>
</blockquote>
<p>Anyways, I prefer we reduce macro usage and use static inline functions<br>
to avoid potential side effects, extra <code>'\'</code> and parentheses.</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=561822016-01-20T03:11:31Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/56182/diff?detail_id=40064">diff</a>)</li></ul><p><code>d_name</code> might not be NUL-terminated, use <code>NAMLEN()</code> on such platforms.</p>
<pre><code class="diff syntaxhl" data-language="diff"><span class="gh">diff --git a/dir.c b/dir.c
index 193b5be..28a1e79 100644
</span><span class="gd">--- a/dir.c
</span><span class="gi">+++ b/dir.c
</span><span class="p">@@ -21,6 +21,7 @@</span>
#include <unistd.h>
#endif
<span class="gi">+#undef HAVE_DIRENT_NAMLEN
</span> #if defined HAVE_DIRENT_H && !defined _WIN32
# include <dirent.h>
# define NAMLEN(dirent) strlen((dirent)->d_name)
<span class="p">@@ -30,6 +31,7 @@</span>
#else
# define dirent direct
# define NAMLEN(dirent) (dirent)->d_namlen
<span class="gi">+# define HAVE_DIRENT_NAMLEN 1
</span> # if HAVE_SYS_NDIR_H
# include <sys/ndir.h>
# endif
<span class="p">@@ -699,6 +701,26 @@</span> fundamental_encoding_p(rb_encoding *enc)
#else
# define READDIR(dir, enc) readdir((dir))
#endif
<span class="gi">+static int
+to_be_skipped(const struct dirent *dp)
+{
+ const char *name = dp->d_name;
+ if (name[0] != '.') return FALSE;
+#ifdef HAVE_DIRENT_NAMLEN
+ switch (NAMLEN(dp)) {
+ case 2:
+ if (name[1] != '.') return FALSE; /* else fall through: ".." */
+ case 1:
+ return TRUE;
+ default: break; /* a label must be followed by a statement */
+ }
+#else
+ if (!name[1]) return TRUE;
+ if (name[1] != '.') return FALSE;
+ if (!name[2]) return TRUE;
+#endif
+ return FALSE;
+}
</span>
/*
* call-seq:
<span class="p">@@ -720,13 +742,12 @@</span> dir_read(VALUE dir)
GetDIR(dir, dirp);
errno = 0;
<span class="gd">- if ((dp = READDIR(dirp->dir, dirp->enc)) != NULL) {
- return rb_external_str_new_with_enc(dp->d_name, NAMLEN(dp), dirp->enc);
- }
- else {
- if (errno != 0) rb_sys_fail(0);
- return Qnil; /* end of stream */
</span><span class="gi">+ while ((dp = READDIR(dirp->dir, dirp->enc)) != NULL) {
+ if (!to_be_skipped(dp))
+ return rb_external_str_new_with_enc(dp->d_name, NAMLEN(dp), dirp->enc);
</span> }
<span class="gi">+ if (errno != 0) rb_sys_fail(0);
+ return Qnil; /* end of stream */
</span> }
/*
<span class="p">@@ -762,8 +783,10 @@</span> dir_each(VALUE dir)
IF_NORMALIZE_UTF8PATH(norm_p = need_normalization(dirp->dir, RSTRING_PTR(dirp->path)));
while ((dp = READDIR(dirp->dir, dirp->enc)) != NULL) {
const char *name = dp->d_name;
<span class="gd">- size_t namlen = NAMLEN(dp);
</span><span class="gi">+ size_t namlen;
</span> VALUE path;
<span class="gi">+ if (to_be_skipped(dp)) continue;
+ namlen = NAMLEN(dp);
</span> #if NORMALIZE_UTF8PATH
if (norm_p && has_nonascii(name, namlen) &&
!NIL_P(path = rb_str_normalize_ospath(name, namlen))) {
</code></pre> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=561952016-01-20T10:26:43Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>Nobuyoshi Nakada wrote:</p>
<blockquote>
<p><code>d_name</code> might not be NUL-terminated, use <code>NAMLEN()</code> on such platforms.</p>
</blockquote>
<p>What is the platform?<br>
POSIX says it is NUL-terminated.<br>
<a href="http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/dirent.h.html" class="external">http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/dirent.h.html</a></p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=561982016-01-20T12:33:59ZEregon (Benoit Daloze)
<ul></ul><p>Yui NARUSE wrote:</p>
<blockquote>
<p><code>Dir#each</code> and <code>Dir#read</code> (including <code>Dir.entries</code>, <code>Dir.foreach</code> and other methods) return <code>"."</code> and <code>".."</code> first.<br>
But in all real use cases <code>"."</code> and <code>".."</code> are useless.<br>
How about excluding them?</p>
</blockquote>
<p>Strongly agreed, I had this on my "things to report" list forever but never got around to reporting it.</p>
<p>I am unsure of the potential compatibility issues.<br>
I guess code ignoring "." and ".." will behave just fine though.</p>
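<p>A minimal sketch of the kind of code that would be unaffected, assuming it filters the two names explicitly. The helper name <code>entries_without_dots</code> is hypothetical and only for illustration:</p>

```ruby
require "tmpdir"
require "fileutils"

# Drop "." and ".." explicitly, as much existing code already does.
# The result is the same whether or not Dir.entries returns them.
def entries_without_dots(path)
  Dir.entries(path).reject { |name| name == "." || name == ".." }.sort
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "a.txt"))
  FileUtils.touch(File.join(dir, ".hidden"))

  # Raw result: includes "." and "..", in whatever order the OS returns them.
  p Dir.entries(dir)
  # Filtered result: only the real entries (ordinary dotfiles are kept).
  p entries_without_dots(dir)
end
```

<p>Because the filter is a no-op when the names are already absent, this pattern would not notice the proposed change.</p>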
<p>P.S.: "." and ".." are not necessarily returned first, it seems, at least on Linux.</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562232016-01-21T02:10:31Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>Yui NARUSE wrote:</p>
<blockquote>
<p>What is the platform?</p>
</blockquote>
<p>Win32 was in my mind, but <code>opendir_internal()</code> terminates it.<br>
So it "should respect <code>d_namlen</code> if available".</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562252016-01-21T02:28:28Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>Nobuyoshi Nakada wrote:</p>
<blockquote>
<p>Yui NARUSE wrote:</p>
<blockquote>
<p>What is the platform?</p>
</blockquote>
<p>Win32 was in my mind, but <code>opendir_internal()</code> terminates it.<br>
So it "should respect <code>d_namlen</code> if available".</p>
</blockquote>
<p>It sounds like "don't need to respect".</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562302016-01-21T03:46:47Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-5 priority-4 priority-default closed" href="/issues/10121">Feature #10121</a>: Dir.empty?</i> added</li></ul> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562332016-01-21T03:50:53Zduerst (Martin Dürst)duerst@it.aoyama.ac.jp
<ul></ul><p>Hello Eric,</p>
<p>On 2016/01/20 11:43, Eric Wong wrote:</p>
<blockquote>
<p><a href="mailto:naruse@airemix.jp" class="email">naruse@airemix.jp</a> wrote:</p>
<blockquote>
<p>Dir#each and Dir#read (including Dir.entries, Dir.foreach and other methods) return "." and ".." first.<br>
But in all real use cases "." and ".." are useless.<br>
How about excluding them?</p>
</blockquote>
<p>If Ruby were a new language, yes. But I think it is too risky, now.</p>
</blockquote>
<p>Can somebody do a code search for this? I know AKR is good at that (but<br>
I don't want to ask him to do this).</p>
<blockquote>
<blockquote>
<pre><code>+#define DIR_IS_DOT_OR_DOTDOT(dp) ((dp)->d_name[0] == '.' && \
+ ((dp)->d_name[1] == '\0' || ((dp)->d_name[1] == '.' && (dp)->d_name[2] == '\0')))
</code></pre>
</blockquote>
<p>Anyways, I prefer we reduce macro usage and use static inline functions<br>
to avoid potential side effects, extra '\' and parentheses.</p>
</blockquote>
<p>Are inline functions now an accepted concept in C? It seems they are<br>
available from C99, but then so are // comments, and they are not yet<br>
accepted in Ruby because of some old compilers.</p>
<p>Also, are there C compilers that inline functions as part of<br>
optimization, even if they aren't marked as such in the source?</p>
<p>While I very much understand your point re. potential side effects,<br>
using inline functions also may save space at the cost of time, in<br>
particular for long macros. Is that also one of the reasons you favor them?</p>
<p>Regards, Martin.</p>
 Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562342016-01-21T04:25:39Zakr (Akira Tanaka)akr@fsij.org
<ul><li><strong>File</strong> <a href="/attachments/5751">dir-entries-usages.txt</a> added</li></ul><p>I searched Dir.entries on gems.</p>
<p>It seems there are size/length invocations on the result of Dir.entries.</p>
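<p>To illustrate why size/length calls matter, here is one plausible shape of such code. It is an assumption for illustration, not taken from the attached search results. The helper <code>looks_empty?</code> is hypothetical, and the proposed behaviour is only simulated by subtracting the two names:</p>

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical emptiness check that counts on "." and ".." being present.
def looks_empty?(entries)
  entries.size <= 2
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "a.txt"))
  FileUtils.touch(File.join(dir, "b.txt"))

  current  = Dir.entries(dir)        # today: includes "." and ".."
  proposed = current - [".", ".."]   # simulated result after the change

  p looks_empty?(current)    # four names, so not considered empty
  p looks_empty?(proposed)   # two names, so wrongly considered empty
  p Dir.empty?(dir)          # Ruby 2.4+: does not depend on counting names
end
```

<p>Code of this shape would silently change meaning rather than raise an error, which is what makes it hard to find by testing alone.</p>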
<p>Note that I used <a href="https://github.com/akr/gem-codesearch" class="external">https://github.com/akr/gem-codesearch</a></p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562352016-01-21T05:02:24Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>"Martin J. Dürst" <a href="mailto:duerst@it.aoyama.ac.jp" class="email">duerst@it.aoyama.ac.jp</a> wrote:</p>
<blockquote>
<p>On 2016/01/20 11:43, Eric Wong wrote:</p>
<blockquote>
<p><a href="mailto:naruse@airemix.jp" class="email">naruse@airemix.jp</a> wrote:</p>
<blockquote>
<p>Dir#each and Dir#read (including Dir.entries, Dir.foreach and other methods) return "." and ".." first.<br>
But in all real use cases "." and ".." are useless.<br>
How about excluding them?</p>
</blockquote>
<p>If Ruby were a new language, yes. But I think it is too risky, now.</p>
</blockquote>
<p>Can somebody do a code search for this? I know AKR is good at that<br>
(but I don't want to ask him to do this).</p>
</blockquote>
<p>I just found some in yahns which I wrote:<br>
<a href="http://yhbt.net/yahns.git/plain/extras/autoindex.rb" class="external">http://yhbt.net/yahns.git/plain/extras/autoindex.rb</a></p>
<p>Easily fixable, but we also don't know what kinds of code people<br>
have in private.</p>
<blockquote>
<blockquote>
<p>Anyways, I prefer we reduce macro usage and use static inline functions<br>
to avoid potential side effects, extra '\' and parentheses.</p>
</blockquote>
<p>Are inline functions now an accepted concept in C? It seems they are<br>
available from C99, but then so are // comments, and they are not<br>
yet accepted in Ruby because of some old compilers.</p>
</blockquote>
<p>It seems so, we already have static inlines everywhere in Ruby.<br>
I guess AC_C_INLINE takes care of the portability inside configure.in</p>
<p>'//' comments can't be worked around using CPP on old compilers.</p>
<p>Fwiw, git and Linux kernel both reject '//' despite using<br>
static inlines heavily; Linux also uses C99 struct initializers,<br>
but git does not as git targets more compilers than Linux.</p>
<blockquote>
<p>Also, are there C compilers that inline functions as part of<br>
optimization, even if they aren't marked as such in the source?</p>
</blockquote>
<p>Yes, actually I often favor plain 'static' to let the compiler<br>
make more decisions. In my experience with gcc, single-use<br>
'static' functions always get inlined (but I don't look at giant<br>
functions much). Usually, smaller icache footprints are<br>
better for overall performance(*).</p>
<p>gcc will also complain about unused 'static', but not 'static inline'<br>
in header files.</p>
<blockquote>
<p>While I very much understand your point re. potential side effects,<br>
using inline functions also may save space at the cost of time, in<br>
particular for long macros. Is that also one of the reasons you<br>
favor them?</p>
</blockquote>
<p>Yes, space, too. I prefer to trust the compiler as much as possible<br>
in the hopes they can make better decisions than me (or may eventually<br>
do so, perhaps with things like PGO).</p>
<p>Functions also get type-checked at compile time. Compile-time<br>
type-checking is nice in C :)</p>
<p>(*) Of course, we default to -O3, which makes our icache footprint<br>
huge. I've tried with -O2 in the past and unfortunately our<br>
benchmarks got slower. It seems there are some hotspots in the<br>
VM core loop that benefit from -O3...</p>
 Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562372016-01-21T07:47:47Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>Akira Tanaka wrote:</p>
<blockquote>
<p>I searched Dir.entries on gems.</p>
<p>It seems there are size/length invocations on the result of Dir.entries.</p>
<p>Note that I used <a href="https://github.com/akr/gem-codesearch" class="external">https://github.com/akr/gem-codesearch</a></p>
</blockquote>
<p>Thanks, I hadn't thought of such a use case.<br>
It sounds like it needs some handling at least.</p>
<p>Could you check about Dir.foreach?</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=562382016-01-21T07:58:22Znaruse (Yui NARUSE)naruse@airemix.jp
<ul></ul><p>Eric Wong wrote:</p>
<blockquote>
<p>"Martin J. Dürst" <a href="mailto:duerst@it.aoyama.ac.jp" class="email">duerst@it.aoyama.ac.jp</a> wrote:</p>
<blockquote>
<p>On 2016/01/20 11:43, Eric Wong wrote:</p>
<blockquote>
<p><a href="mailto:naruse@airemix.jp" class="email">naruse@airemix.jp</a> wrote:</p>
<blockquote>
<p>Dir#each and Dir#read (including Dir.entries, Dir.foreach and other methods) return "." and ".." first.<br>
But in all real use cases "." and ".." are useless.<br>
How about excluding them?</p>
</blockquote>
<p>If Ruby were a new language, yes. But I think it is too risky, now.</p>
</blockquote>
<p>Can somebody do a code search for this? I know AKR is good at that<br>
(but I don't want to ask him to do this).</p>
</blockquote>
<p>I just found some in yahns which I wrote:<br>
<a href="http://yhbt.net/yahns.git/plain/extras/autoindex.rb" class="external">http://yhbt.net/yahns.git/plain/extras/autoindex.rb</a></p>
</blockquote>
<p>Hmm...<br>
Emulating a directory listing sounds like a valid use case...</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=563272016-01-22T03:38:53Zkddnewton (Kevin Newton)kddnewton@gmail.com
<ul></ul><p>Seems like a good place for an option in these methods, but default it to include to make migration easier.</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=565362016-01-23T08:42:09Zshevegen (Robert A. Heiler)shevegen@gmail.com
<ul></ul><blockquote>
<p>default it to include to make migration easier</p>
</blockquote>
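<p>For concreteness, the quoted idea could be sketched in plain Ruby as below. Both the wrapper name <code>list_entries</code> and the keyword <code>include_dots</code> are hypothetical. No core <code>Dir</code> method is assumed to accept such an option:</p>

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical wrapper: include_dots defaults to true so that existing
# callers keep seeing "." and ".." unless they opt out.
def list_entries(path, include_dots: true)
  names = Dir.entries(path)
  include_dots ? names : names - [".", ".."]
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "a.txt"))

  p list_entries(dir).sort                       # unchanged default behaviour
  p list_entries(dir, include_dots: false).sort  # opt-in filtering
end
```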
<p>I dunno. I never found a usecase for <code>'.'</code> and <code>'..'</code> so I would have no migration need at<br>
all since I did not use it. And in the old cases where I used it, I always ended<br>
up filtering away the <code>'.'</code> and <code>'..'</code> since I have no idea what to do with it.</p>
<p>But I also have not used <code>Dir.entries</code> much for many years. I fell in love<br>
with <code>Dir[]</code> and have been using that ever since, that is, <code>Dir['*']</code> or any other<br>
filter. I assume that perhaps <code>Dir.entries()</code> is not used that much in "modern" ruby<br>
code.</p>
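<p>A small sketch of the difference between the two approaches. The helper <code>glob_names</code> is hypothetical. Note that the default glob skips all names starting with a dot, not just <code>"."</code> and <code>".."</code>:</p>

```ruby
require "tmpdir"
require "fileutils"

# Names matched by Dir["*"] inside the given directory.
# The default glob does not match names that start with a dot.
def glob_names(path)
  Dir.chdir(path) { Dir["*"].sort }
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "a.txt"))
  FileUtils.touch(File.join(dir, ".hidden"))

  p glob_names(dir)           # only non-dot names
  p Dir.entries(dir).sort     # ".", "..", and every other name
end
```

<p>So <code>Dir['*']</code> is not a drop-in replacement for a dot-free <code>Dir.entries</code> when hidden files matter.</p>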
<p>I did a grep in the rack-1.6.4 source, just to use something as reference :-)</p>
<p>4 Instances of <code>Dir[]</code>, one in lib/rack/directory.rb, two in rack.gemspec,<br>
one in a Rakefile.</p>
<p>And no instance of <code>Dir.entries</code> hehe</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=707262018-03-01T01:13:04Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-5 priority-4 priority-default closed" href="/issues/13969">Feature #13969</a>: Dir#each_child</i> added</li></ul> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=707282018-03-01T01:13:34Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>This can be closed by <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: Dir#each_child (Closed)" href="https://redmine.ruby-lang.org/issues/13969">#13969</a>?</p> Ruby master - Feature #12010: Exclude dot and dotdot from Dir#eachhttps://redmine.ruby-lang.org/issues/12010?journal_id=941502021-10-16T00:30:46Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Closed</i></li></ul>
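<p>For reference, a short sketch of the separate methods from the related Feature #13969, assuming Ruby 2.5 or later where <code>Dir.children</code> and <code>Dir.each_child</code> are available. The helper <code>child_names</code> is hypothetical:</p>

```ruby
require "tmpdir"
require "fileutils"

# Entry names without "." and "..", via the dedicated method.
def child_names(path)
  Dir.children(path).sort
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "a.txt"))
  FileUtils.touch(File.join(dir, ".hidden"))

  p child_names(dir)                          # no "." or "..", dotfiles kept
  Dir.each_child(dir) { |name| puts name }    # block form of the same listing
  p Dir.entries(dir).sort                     # still includes "." and ".."
end
```

<p>Adding new methods leaves <code>Dir.entries</code> and <code>Dir.foreach</code> unchanged, which avoids the compatibility risk raised earlier in the thread.</p>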