Bug #5577

test/testunit/test_parallel.rb causes NoMethodError when file descriptor is limited to 30.

Added by akr (Akira Tanaka) about 8 years ago. Updated over 7 years ago.

Status: Assigned
Priority: Normal
Target version: -
ruby -v: ruby 2.0.0dev (2011-11-06 trunk 33645) [x86_64-linux]
Backport:
[ruby-dev:44802]

Description

With file descriptors limited to 30 by ulimit -n 30,
running test/testunit/test_parallel.rb fails as shown below.

The failure itself is not a problem, but calling methods that do not exist on nil, as in
undefined method `close' for nil:NilClass (NoMethodError)
NoMethodError: undefined method `chomp' for nil:NilClass
does not seem right.

It would be better to print a message that identifies the cause of the failure.
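
As a rough illustration of the request above, the sketch below shows one way to avoid calling a method on nil in an ensure clause and to name the real cause instead. The helper with_guarded_io and its opener argument are hypothetical names introduced here; this is not the code in lib/test/unit/parallel.rb.

```ruby
# Hypothetical helper for illustration; not the code in lib/test/unit/parallel.rb.
# `opener` is any callable that returns an IO, e.g. -> { STDOUT.dup }.
def with_guarded_io(opener)
  io = nil
  begin
    io = opener.call            # may raise Errno::EMFILE when descriptors run out
    yield io
  rescue Errno::EMFILE => e
    # Report the real cause instead of failing later with NoMethodError on nil.
    warn "cannot open IO (file descriptor limit reached?): #{e.message}"
    false
  ensure
    io.close if io && !io.closed?   # io is still nil when opener raised
  end
end
```

With this shape, a failing opener produces a message on STDERR and a false return value, and the ensure clause never touches nil.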

% (ulimit -n 30; make test-all TESTS='test/testunit/test_parallel.rb')
./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems" test/testunit/test_parallel.rb
Run options: "--ruby=./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems"

# Running tests:

.FFFF/home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:138:in `run'
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:156:in `<main>'
F/home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:138:in `run'
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:156:in `<main>'
F/home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:138:in `run'
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:156:in `<main>'
E/home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:138:in `run'
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:156:in `<main>'
F/home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:138:in `run'
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:156:in `<main>'
F/home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:138:in `run'
from /home/akr/ruby/tst2/ruby/test/testunit/../../lib/test/unit/parallel.rb:156:in `<main>'
F

Finished tests in 5.751803s, 1.9124 tests/s, 1.9124 assertions/s.

1) Failure:

test_jobs_status(TestParallel::TestParallel) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:177]:
Expected /\d+=ptest_(first|second|third|forth) */ to match "Run options: --ruby \"./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems\" -j t1 --jobs-status\n\n# Running tests:\n\n/home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:138:in `run'\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:156:in `<main>'\n\nSome worker was crashed. It seems ruby interpreter's bug\nor, a bug of test/unit/parallel.rb. try again without -j\noption.\n\n".

2) Failure:

test_no_retry_option(TestParallel::TestParallel) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:171]:
Expected /^ +\d+\) Failure:\ntest_fail_at_worker\(TestD\)/ to match "Run options: --ruby \"./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems\" -j t1 --no-retry\n\n# Running tests:\n\n/home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:138:in `run'\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:156:in `<main>'\n\nSome worker was crashed. It seems ruby interpreter's bug\nor, a bug of test/unit/parallel.rb. try again without -j\noption.\n\n".

3) Failure:

test_should_retry_failed_on_workers(TestParallel::TestParallel) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:164]:
Expected /Retrying.+$/ to match "Run options: --ruby \"./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems\" -j t1\n\n# Running tests:\n\n/home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:138:in `run'\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:156:in `<main>'\n\nSome worker was crashed. It seems ruby interpreter's bug\nor, a bug of test/unit/parallel.rb. try again without -j\noption.\n\n".

4) Failure:

test_should_run_all_without_any_leaks(TestParallel::TestParallel) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:158]:
Expected /[SF.]{7}$/ to match "Run options: --ruby \"./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems\" -j t1\n\n# Running tests:\n\n/home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:137:in `ensure in run': undefined method `close' for nil:NilClass (NoMethodError)\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:138:in `run'\n\tfrom /home/akr/ruby/tst2/ruby/lib/test/unit/parallel.rb:156:in `<main>'\n\nSome worker was crashed. It seems ruby interpreter's bug\nor, a bug of test/unit/parallel.rb. try again without -j\noption.\n\n".

5) Failure:

test_accept_run_command_multiple_times(TestParallel::TestParallelWorker) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:64]:
Expected /ready/ to match nil.

6) Failure:

test_run(TestParallel::TestParallelWorker) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:40]:
Expected /ready/ to match nil.

7) Failure:

test_run_multiple_testcase_in_one_file(TestParallel::TestParallelWorker) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:51]:
Expected /ready/ to match nil.

8) Failure:

test_quit(TestParallel::TestParallelWorker) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:116]:
Expected /bye$/m to match "".

9) Failure:

test_done(TestParallel::TestParallelWorker) [/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:96]:
Expected /done (.+?)$/ to match nil.

10) Error:
test_p(TestParallel::TestParallelWorker):
NoMethodError: undefined method `chomp' for nil:NilClass
/home/akr/ruby/tst2/ruby/test/testunit/test_parallel.rb:86:in `block in test_p'

11 tests, 11 assertions, 9 failures, 1 errors, 0 skips

ruby -v: ruby 2.0.0dev (2011-11-06 trunk 33645) [x86_64-linux]
make: *** [yes-test-all] Error 10
zsh: exit 2 ( ulimit -n 30; make test-all TESTS='test/testunit/test_parallel.rb'; )
% ./ruby -v
ruby 2.0.0dev (2011-11-06 trunk 33645) [x86_64-linux]

Associated revisions

Revision 42b1df08
Added by sorah (Sorah Fukumori) over 7 years ago

  • lib/test/unit.rb: Put error message into STDERR if failed to launch
    worker (job) process. [ruby-dev:44802] [Bug #5577]

  • lib/test/unit/parallel.rb: If failed to increment_io, exit with code 1.
    [ruby-dev:44802] [Bug #5577]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34968 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 34968
Added by sorah (Sorah Fukumori) over 7 years ago

  • lib/test/unit.rb: Put error message into STDERR if failed to launch
    worker (job) process. [ruby-dev:44802] [Bug #5577]

  • lib/test/unit/parallel.rb: If failed to increment_io, exit with code 1.
    [ruby-dev:44802] [Bug #5577]

History

Updated by znz (Kazuhiro NISHIYAMA) about 8 years ago

  • Assignee set to sorah (Sorah Fukumori)

Updated by sorah (Sorah Fukumori) over 7 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r34968.
Akira, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • lib/test/unit.rb: Put error message into STDERR if failed to launch
    worker (job) process. [ruby-dev:44802] [Bug #5577]

  • lib/test/unit/parallel.rb: If failed to increment_io, exit with code 1.
    [ruby-dev:44802] [Bug #5577]

Updated by sorah (Sorah Fukumori) over 7 years ago

  • Status changed from Closed to Assigned

For now, I addressed the root cause: when launching a worker fails, an error message is printed and the process exits with status 1. (r34968)
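
The behavior described above could look roughly like the following sketch. The method name launch_worker and the message text are assumptions made for illustration; the actual change is the one in r34968, which is not reproduced here.

```ruby
require "rbconfig"

# Hypothetical sketch: start a worker over a pipe and report launch failures
# on STDERR instead of letting a nil IO leak into later code.
def launch_worker(cmd)
  IO.popen(cmd, "r+")
rescue SystemCallError => e
  # Covers errors such as Errno::EMFILE (descriptor limit) and Errno::ENOENT.
  warn "failed to launch worker process: #{e.class}: #{e.message}"
  nil
end
```

A caller would check the return value and call exit 1 when it is nil, so that the failure is reported once and with its cause.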

However, on OS X, (ulimit -n 30; make test-all TESTS='test/testunit/test_parallel.rb') already hung before this fix
and ended up waiting for the timeout, so I am investigating that.

Inserting p @workers just above the line while _io=IO.select(@ios)[0] at lib/test/unit.rb:464 and testing as follows gives:

$ make install-nodoc
$ cat ../../test.rb
ARGV = ["-j1"]
require "#{File.dirname(__FILE__)}/test/testunit/tests_for_parallel/runner"
$ ( ulimit -n 15; make gdb-ruby )
gdb -x run.gdb --quiet --args ruby ../../test.rb
Reading symbols for shared libraries .... done
Breakpoint 1 at 0x20c49ba5d5e854: file ../../debug.c, line 137.
Reading symbols for shared libraries +++............................ done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
../../test.rb:1: warning: already initialized constant ARGV
Run options: -j1

# Running tests:

[15463:waiting]
^C
Program received signal SIGINT, Interrupt.
0x000000010006aece in rb_newobj () at ../../gc.c:1328
1328 objspace->heap.free_slots->freelist = RANY(obj)->as.free.next;
Current language: auto; currently minimal
(gdb) rb_backtrace
from ../../test.rb:2:in `<main>'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/rubygems/custom_require.rb:36:in `require'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/rubygems/custom_require.rb:36:in `require'
from /.../ruby/test/testunit/tests_for_parallel/runner.rb:10:in `<top (required)>'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:696:in `run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:692:in `run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:660:in `run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:21:in `run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/minitest/unit.rb:911:in `run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/minitest/unit.rb:922:in `_run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/minitest/unit.rb:922:in `each'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/minitest/unit.rb:923:in `block in _run'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/minitest/unit.rb:936:in `run_tests'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/minitest/unit.rb:773:in `_run_anything'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:614:in `_run_suites'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:465:in `_run_parallel'
from /.../ruby/builds/trunk/local/lib/ruby/2.0.0/test/unit.rb:465:in `select'

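Execution stops at the line while _io=IO.select(@ios)[0] and the process never exits.
The p @workers output, [15463:waiting], comes from Worker#to_s, which uses the format "pid:status".
It shows that pid 15463 is in the waiting state (the state right after a worker is launched, while waiting for the worker to print the string ready for the first time).
@ios is @workers.map(&:io), an array of the pipes (created with IO.popen) to the worker processes.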
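
To make the waiting state concrete, here is a minimal sketch of a parent that waits for a worker's first ready line over an IO.popen pipe. The function wait_for_ready, the timeout value, and the one-line stand-in worker are hypothetical and much simpler than lib/test/unit.rb. Unlike the IO.select(@ios) call quoted above, this sketch passes a timeout so it cannot block forever.

```ruby
require "rbconfig"

# Hypothetical stand-in for a worker: prints "ready" and exits.
worker_cmd = [RbConfig.ruby, "-e", '$stdout.sync = true; puts "ready"']

# Wait until the worker pipe becomes readable, then classify what arrived.
def wait_for_ready(io, timeout)
  readable, = IO.select([io], nil, nil, timeout)
  return :timeout unless readable   # IO.select returns nil on timeout
  line = io.gets                    # nil means EOF: the worker exited silently
  return :eof if line.nil?
  line.chomp == "ready" ? :ready : :unexpected
end

io = IO.popen(worker_cmd, "r+")
status = wait_for_ready(io, 10)
io.close
```

Checking for nil before calling chomp is what separates a clear :eof result from the NoMethodError shown in the original report.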

Running ps 15463 in another terminal before pressing ^C shows that the worker has already exited,
so I am investigating why EOF never arrives on the IO objects in @ios.
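
One general mechanism that can keep EOF from arriving is sketched below: a pipe reader sees EOF only after every descriptor for the write end has been closed. This illustrates pipe semantics only; whether it is the cause of the hang in this issue is not established here. The variable names are chosen for this example.

```ruby
r, w = IO.pipe
w_copy = w.dup        # a second descriptor referring to the same write end
w.close               # closing one writer is not enough for EOF

# Timeout: the read end is not readable because w_copy is still open.
before = IO.select([r], nil, nil, 0.1)

w_copy.close          # the last writer is closed now

# The read end becomes readable, and reading returns an empty string at EOF.
after = IO.select([r], nil, nil, 0.1)
data = r.read
r.close
```

If a descriptor for the write end is still open anywhere, for example one inherited by another process, IO.select without a timeout keeps waiting even though the original writer has exited.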
