Project

General

Profile

Actions

Bug #6228

closed

[mingw] Errno::EBADF in ruby/test_io.rb on ruby_1_9_3

Added by jonforums (Jon Forums) over 12 years ago. Updated about 12 years ago.

Status:
Closed
Target version:
ruby -v:
ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]
Backport:
[ruby-core:43951]

Description

On Win7 32bit built with MinGW 4.6.2 using the RubyInstaller build recipes, I get the following test error:

sh-3.1$ ruby --version
ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]

sh-3.1$ make test-all TESTS='openssl fiddle psych zlib io json mkmf pathname stringio erb
fileutils ruby/test_io.rb ruby/test_io_m17n.rb ruby/test_file.rb'

  1. Error:
    test_lines(TestIO):
    Errno::EBADF: Bad file descriptor
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:33:in close' c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:33:in rescue in block in pipe'
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:30:in `block in pipe'

Updated by naruse (Yui NARUSE) over 12 years ago

Do you know which revision break it?

Updated by jonforums (Jon Forums) over 12 years ago

Haven't found root cause, but these commits also have Errno::EBADF errors:

commit revision patch


b1bcb82 r34865 p154
1b21d03 r34811 p146
7e2e3dc r34806 p144

In the above commits, the failures are at a different location in ruby/test_io.rb likely because I'm not building deps in order to speed things up:

  1. Error:
    test_invalid_advise(TestIO):
    Errno::EBADF: Bad file descriptor - C:/Users/Jon/AppData/Local/Temp/test_io20120330-6868-1lqi3p9
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1119:in make_tempfile' c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1848:in block (2 levels) in test_invalid_advise'
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1847:in each' c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1847:in block in test_invalid_advise'
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1846:in each' c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1846:in test_invalid_advise'
    ../../../../Users/Jon/Documents/RubyDev/ruby-git/test/runner.rb:15:in `'

I suspect this testing difference isn't relevant since EBADF errors persist, but it is a difference.

Time permitting this weekend, I'll try again using git bisect

Updated by jonforums (Jon Forums) over 12 years ago

In my configuration, one fix is to remove the recent -DFD_SETSEZE=32767 define from the RubyInstaller build recipe. Removing the define gives the following results on r35179:

  • no deps build using WinSDK 7.1 (without RubyInstaller) has no test_lines(TestIO) nor test_invalid_advise(TestIO) Errno::EBADF failures
  • no deps build using MinGW GCC 4.6.2 (with or without RubyInstaller) has no test_lines nor test_invalid_advise failures
  • full deps build using MinGW GCC 4.6.2 (with RubyInstaller recipe) has no test_lines nor test_invalid_advise failures

However, removing the define is not acceptable as it was added to fix the following user issue

https://github.com/oneclick/rubyinstaller/issues/104

And it get's more convoluted.

Running git bisect between commits a81c766/r34649 (good) and 7e2e3dc/r34806 (bad) with -DFD_SETSEZE=32767 enabled in the RubyInstaller recipes, the "first bad commit" is Nobu's 35bf809/r34749 tweak to Enumerator in io.c. Building the "good" fa1a2ae/r34746 parent commit (without deps) resulted in no test_invalid_advise failure when running make test-all TESTS='ruby/test_io.rb'. But strangely, test_lines failed while test_invalid_advise passed when running my original make test-all TESTS='openssl ... ruby/test_io.rb ...'

I suspect there are other lurking issues left to discover, so don't close this issue as a 3rd party issue until we know more.

Updated by jonforums (Jon Forums) over 12 years ago

of course it's the FD_SETSIZE macro I'm talking about...

Updated by naruse (Yui NARUSE) over 12 years ago

  • Status changed from Open to Assigned
  • Assignee set to luislavena (Luis Lavena)

Updated by h.shirosaki (Hiroshi Shirosaki) over 12 years ago

This issue seems to be related to GC finalizer.
GC is triggered just before Errno::EBADF occurs, and finalizer closes the fd of make_tempfile. Errno::EBADF is raised for trying to close again the closed fd of make_tempfile.

This seems due to test_flush_in_finalizer1.

 500.times {
   f = File.open(path, "w")
   fds << f.fileno
   f.print "hoge"
 }

f is not closed. When GC sweeps f, the fd associated with f seems to be closed.
fds contains 3. And make_tempfile in test_invalid_advise creates fd of 3.

I confirmed the error disappeared with the following patch.
-DFD_SETSEZE=32767 and ruby 1.9.3p184 (2012-04-15 revision 35335) [i386-mingw32] on Windows 7.

diff --git a/test/ruby/test_io.rb b/test/ruby/test_io.rb
index edf449d..70ca240 100644
--- a/test/ruby/test_io.rb
+++ b/test/ruby/test_io.rb
@@ -1800,6 +1800,7 @@ End
end
ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

  • GC.start
    end

def test_flush_in_finalizer2

Updated by h.shirosaki (Hiroshi Shirosaki) over 12 years ago

I tried to create simplified test case of this issue.

cat gc_io_test.rb
f = open("test_file", "w")
p f.fileno
IO.for_fd(f.fileno).close

f = open("test_file", "w")
p f.fileno

GC.start # trigger GC

begin
f.close # => raise Errno::EBADF
rescue
p $!
end

File.unlink("test_file")

~/ruby-trunk/bin/ruby -v gc_io_test.rb
ruby 2.0.0dev (2012-04-17 trunk 35363) [x86_64-darwin11.3.0]
5
5
#<Errno::EBADF: Bad file descriptor - test_file>

Updated by luislavena (Luis Lavena) over 12 years ago

  • Status changed from Assigned to Open
  • Assignee deleted (luislavena (Luis Lavena))

Who should be the responsible for GC? Seems to me that open file descriptor shouldn't be garbage collected while still open.

Updated by phasis68 (Heesob Park) over 12 years ago

I found that this issue is related with the auto-close flag of IO.

The following code fails with Errno::EBADF

f = open("test_file", "w")
IO.for_fd(f.fileno).autoclose = true # default value
GC.start
f.close

But, the following code works fine.

f = open("test_file", "w")
IO.for_fd(f.fileno).autoclose = false
GC.start
f.close

Updated by mame (Yusuke Endoh) over 12 years ago

  • Status changed from Open to Assigned
  • Assignee set to authorNari (Narihiro Nakamura)

Who should be the responsible for GC?

Nari-san.

--
Yusuke Endoh

Updated by nagachika (Tomoyuki Chikanaga) over 12 years ago

Hi,

I think the root cause is wrong usage of IO.for_fd in ensure clause of test_flush_finalizer1,
and just call GC.start works fine for recycle fds.

This test was written by nobu.

nakada-san
Is the following patch seems ok?

diff --git a/test/ruby/test_io.rb b/test/ruby/test_io.rb
index ebf9b76..be5a333 100644
--- a/test/ruby/test_io.rb
+++ b/test/ruby/test_io.rb
@@ -1891,7 +1891,7 @@ End
}
end
ensure

  • fds.each {|fd| IO.for_fd(fd).close rescue next}
  • GC.start
    end

def test_flush_in_finalizer2

Updated by jonforums (Jon Forums) over 12 years ago

Why is the following

ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

wrong usage of IO.for_fd?

In normal usage, I would likely use this style of explicit closure (by IO wrapping the fd) but not kick off a GC cycle with GC.start.

I'd like to understand why the usage is wrong. Specifically, is there another style of explicit recycling of fds that doesn't require starting a GC cycle?

Updated by mame (Yusuke Endoh) over 12 years ago

  • Assignee changed from authorNari (Narihiro Nakamura) to nobu (Nobuyoshi Nakada)

+1 for nagachika-san.

jonforums (Jon Forums) wrote:

Why is the following

ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

wrong usage of IO.for_fd?

f = open("...") # this first IO has a fd X which is opened now.
IO.for_fd(f.fileno) # this second IO also has the fd X.
GC.start # the second IO is GC'ed; the fd X is auto-closed.
f.close # this attempts to close the fd X, but it is already closed => Errno::EBADF

Specifically, is there another style of explicit recycling of fds that doesn't require starting a GC cycle?

Call IO#close explicitly.

--
Yusuke Endoh

Updated by jonforums (Jon Forums) over 12 years ago

mame (Yusuke Endoh) wrote:

jonforums (Jon Forums) wrote:

Why is the following

ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

wrong usage of IO.for_fd?

f = open("...") # this first IO has a fd X which is opened now.
IO.for_fd(f.fileno) # this second IO also has the fd X.
GC.start # the second IO is GC'ed; the fd X is auto-closed.
f.close # this attempts to close the fd X, but it is already closed => Errno::EBADF

Specifically, is there another style of explicit recycling of fds that doesn't require starting a GC cycle?

Call IO#close explicitly.

Endoh-san, thank you. I understand.

Why do you think the same incorrect test code fails (consistently, not intermittently) on windows when only using -DFD_SETSIZE, but doesn't appear to fail on *nix? On Arch, I've not been able to repro when building using -DFD_SETSIZE=32767. This may indicate a lurking weakness in an _WIN32 MRI codepath (or winsock2 peculiarity) that doesn't exist (or does a better job of hiding) on *nix.

FYI, there's another recent issue on windows in which -DFD_SETSIZE=32767 exposes a SEGV with EventMachine with a simple EM.run {}. By default, EM sets FD_SETSIZE to 1024. Rebuilding ruby on windows without the FD_SETSIZE override fixes the SEGV. The EM issue is a separate issue than this one, but may be related. I haven't started investigating, so FYI.

I will test again on ruby_1_9_3 and trunk with the suggested patch and reply if errors persist.

Jon

Actions #16

Updated by h.shirosaki (Hiroshi Shirosaki) over 12 years ago

Why do you think the same incorrect test code fails (consistently, not intermittently) on windows when only using -DFD_SETSIZE, but doesn't appear to fail on *nix? On Arch, I've not been able to repro when building using -DFD_SETSIZE=32767. This may indicate a lurking weakness in an _WIN32 MRI codepath (or winsock2 peculiarity) that doesn't exist (or does a better job of hiding) on *nix.

This issue depends on the timing when IO object is GCed. GC of an object with the same fd must occur between open() and close().
I guess -DFD_SETSIZE=32767 would change memory usage and GC timing. MRI implementation is different between WIndows and Linux, so GC timing would differ.

FYI, there's another recent issue on windows in which -DFD_SETSIZE=32767 exposes a SEGV with EventMachine with a simple EM.run {}. By default, EM sets FD_SETSIZE to 1024. Rebuilding ruby on windows without the FD_SETSIZE override fixes the SEGV. The EM issue is a separate issue than this one, but may be related. I haven't started investigating, so FYI.

I think above issue is cause of fd_array buffer overflow.

typedef struct fd_set
{
u_int fd_count;
SOCKET fd_array[FD_SETSIZE];
} fd_set;

On EM, FD_SETSIZE = 1024 and fd_array[1024].
EM uses FD_SET() and FD_SET() seems rb_w32_fdset() on Windows.

https://github.com/ruby/ruby/blob/trunk/win32/win32.c#L2458

In rb_w32_fdset(), FD_SETSIZE = 32767 since rb_w32_fdset is compiled with -DFD_SETSIZE=32767.

if (i == set->fd_count) {
    if (set->fd_count < FD_SETSIZE) { // FD_SETSIZE = 32767
        set->fd_array[i] = s;                 // `i` could be over 1023
        set->fd_count++;
    }
}

If above scenario is correct, FD_SETSIZE of Ruby should be equal or less then FD_SETSIZE of EM.

include/winsock2.h has FD_SET macro on mingw, but MRI undef FD_SET and uses rb_w32_fdset() function. It might be better that FD_SET() is macro instead of function.

Updated by luislavena (Luis Lavena) over 12 years ago

  • Assignee changed from nobu (Nobuyoshi Nakada) to usa (Usaku NAKAMURA)

Usa-san, what do you think about Hiroshi suggestion about leave FD_SET() as macro instead?

Updated by usa (Usaku NAKAMURA) over 12 years ago

Hello,

In message "[ruby-core:44560] [ruby-trunk - Bug #6228] [mingw] Errno::EBADF in ruby/test_io.rb on ruby_1_9_3"
on Apr.24,2012 00:38:57, wrote:

Usa-san, what do you think about Hiroshi suggestion about leave FD_SET() as macro instead?

First, as Jon mentioned, the SEGV of EM is another issue.
Please split it from this ticket.

Second, I don't understand how Hiroshi want to change the code,
so I cannot say anything about this.

Regards,

U.Nakamura

Updated by luislavena (Luis Lavena) over 12 years ago

  • Assignee changed from usa (Usaku NAKAMURA) to luislavena (Luis Lavena)

usa (Usaku NAKAMURA) wrote:

First, as Jon mentioned, the SEGV of EM is another issue.
Please split it from this ticket.

Understood, creating another one to have FD_SET() discussion.

I'll commit suggested patch from nagachika-san.

Thank you.

Updated by luislavena (Luis Lavena) over 12 years ago

  • Status changed from Assigned to Feedback

luislavena (Luis Lavena) wrote:

I'll commit suggested patch from nagachika-san.

I seem unable to reproduce the exact exception on both trunk and ruby_1_9_3 branches. The error seems randomly manifest.

Right now, against ruby_1_9_3, it shows in test_invalid_advise

  5) Error:
test_invalid_advise(TestIO):
Errno::EBADF: Bad file descriptor - C:/Users/Luis/AppData/Local/Temp/test_io20120427-1432-1xmkrxh
    C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1119:in `make_tempfile'
    C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1872:in `block (2 levels) in test_invalid_advise'
    C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1871:in `each'
    C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1871:in `block in test_invalid_advise'
    C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1870:in `each'
    C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1870:in `test_invalid_advise'
    ../test/runner.rb:15:in `'

Updated by jonforums (Jon Forums) over 12 years ago

I seem unable to reproduce the exact exception on both trunk and ruby_1_9_3 branches. The error seems randomly manifest.

Right now, against ruby_1_9_3, it shows in test_invalid_advise

I've seen failures in both test_invalid_advise (see comment #3) and test_lines. Both disappeared when I tested Nagachika-san's suggested patch.

I'll try again with the latest trunk and ruby_1_9_3, but IIRC, the test_invalid_advise failure occurred when I built without deps.

Updated by jonforums (Jon Forums) over 12 years ago

On Win7 32bit, using the following build recipe on both trunk@r35485 and ruby_1_9_3@r35484 with MinGW GCC 4.6.2

sh -c "../configure --enable-shared --disable-install-doc CPPFLAGS='-DFD_SETSIZE=32767'"
make
make test-all TESTS="ruby/test_io.rb"

I get only the test_invalid_advise(TestIO) error on ruby_1_9_3, and a freeze (cleared by CTRL-C) on trunk. Nagachika-san's suggested patch fixes the errors as does adding TestIO#setup and TestIO#teardown containing GC.disable and GC.enable, respectively.

Actions #23

Updated by luislavena (Luis Lavena) over 12 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r35631.
Jon, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • test/ruby/test_io.rb (class TestIO): Disable GC during IO tests to
    avoid file descriptors being GC'ed. Suggestion by Tomoyuki Chikanaga
    [ruby-core:43951][Bug #6228]

Updated by luislavena (Luis Lavena) over 12 years ago

  • Status changed from Closed to Assigned
  • % Done changed from 100 to 0

Reopening as reported in [ruby-core:45015] this breaks chkbuild on debian.

Updated by h.shirosaki (Hiroshi Shirosaki) over 12 years ago

Hi Luis,

I think Nagachika-san's original patch as below is sufficient for this issue. GC.disable for all tests seems too far.
Did you test with this patch?

diff --git a/test/ruby/test_io.rb b/test/ruby/test_io.rb
index ebf9b76..be5a333 100644
--- a/test/ruby/test_io.rb
+++ b/test/ruby/test_io.rb
@@ -1891,7 +1891,7 @@ End
}
end
ensure

  • fds.each {|fd| IO.for_fd(fd).close rescue next}
  • GC.start
    end

def test_flush_in_finalizer2

Updated by nagachika (Tomoyuki Chikanaga) over 12 years ago

Hi Luis,

Sorry for a lack of enough explanations about my patch.

The ensure clause of test_flush_in_finalizer1 seems to try close all IO objects opened above and not GC'ed & finalized using IO.for_fd. But it's wrong usage of IO.for_fd. The scenario is as follows.

io = open("..")
fd = io.fileno
io = nil
IO.for_fd(fd).close # <- at this moment, io (if not GC'ed yet) has invalid fd,
# because it's already closed by IO.for_fd(fd).close
new_io = open("...") # new_io might have same fd of io, because file descriptor can be recycled.
... # => io is GC'ed and finalized here. Now new_io has closed fd!
new_io.gets # => raised Errno::EBADF

test_flush_in_finalizer1 is test for finalizer of IO (see #3910) and GC.disable breaks it's intent.
I'm sorry, but I'd like to revert r35631.
And I'll commit my patch. Please check it.

Actions #27

Updated by nagachika (Tomoyuki Chikanaga) over 12 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r35637.
Jon, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • test/ruby/test_io.rb (test_flush_in_finalizer1): don't use IO.for_fd
    to close IO objects. it create IO object with already closed fd, and
    cause occasional Errno::EBADF in following tests. [ruby-core:45020]
    [Bug #6228]

Updated by luislavena (Luis Lavena) over 12 years ago

Hi Tomoyuki,

Please revert my change. As commented before, was unable to use your
suggested patch since the invalid file descriptor error 'moved' to a
different test.

I was unable to replicate the error from chkbuild.

Please revert and commit your change, will verify again later today.

Thank you.

Sorry for top posting. Sent from mobile.
On May 13, 2012 11:32 AM, "nagachika (Tomoyuki Chikanaga)" <
> wrote:

Issue #6228 has been updated by nagachika (Tomoyuki Chikanaga).

Hi Luis,

Sorry for a lack of enough explanations about my patch.

The ensure clause of test_flush_in_finalizer1 seems to try close all IO
objects opened above and not GC'ed & finalized using IO.for_fd. But it's
wrong usage of IO.for_fd. The scenario is as follows.

io = open("..")
fd = io.fileno
io = nil
IO.for_fd(fd).close # <- at this moment, io (if not GC'ed yet) has
invalid fd,
# because it's already closed by IO.for_fd(fd).close
new_io = open("...") # new_io might have same fd of io, because file
descriptor can be recycled.
... # => io is GC'ed and finalized here. Now new_io has closed fd!
new_io.gets # => raised Errno::EBADF

test_flush_in_finalizer1 is test for finalizer of IO (see #3910) and
GC.disable breaks it's intent.
I'm sorry, but I'd like to revert r35631.
And I'll commit my patch. Please check it.

Bug #6228: [mingw] Errno::EBADF in ruby/test_io.rb on ruby_1_9_3
https://bugs.ruby-lang.org/issues/6228#change-26601

Author: jonforums (Jon Forums)
Status: Assigned
Priority: Normal
Assignee: luislavena (Luis Lavena)
Category: test
Target version: 1.9.3
ruby -v: ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]

On Win7 32bit built with MinGW 4.6.2 using the RubyInstaller build
recipes, I get the following test error:

sh-3.1$ ruby --version
ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]

sh-3.1$ make test-all TESTS='openssl fiddle psych zlib io json mkmf
pathname stringio erb
fileutils ruby/test_io.rb ruby/test_io_m17n.rb ruby/test_file.rb'

  1. Error:
    test_lines(TestIO):
    Errno::EBADF: Bad file descriptor
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:33:in
    close' c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:33:in rescue in block in pipe'
    c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:30:in
    `block in pipe'

--
http://bugs.ruby-lang.org/

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0