Bug #15986

`TestJIT#test_block_handler_with_possible_frame_omitted_inlining` fails on s390x and armv7hl

Added by vo.x (Vit Ondruch) about 1 year ago. Updated about 1 year ago.

Status:
Closed
Priority:
Normal
Target version:
-
ruby -v:
ruby 2.7.0dev (2019-07-04T10:34:08Z master d9f8b88b47) [s390x-linux]
[ruby-core:93542]

Description

I am trying to build the Ruby 2.7 snapshot for Fedora Rawhide, but I observe the following test failure on s390x and aarch64 platforms:

  1) Failure:
TestJIT#test_block_handler_with_possible_frame_omitted_inlining [/builddir/build/BUILD/ruby-2.7.0-d9f8b88b47/test/ruby/test_jit.rb:846]:
Expected 2 times of JIT success, but succeeded 1 times.
script:
"""
def multiply(a, b)
  a *= b
end
3.times do
  p multiply(7.0, 10.0)
end
"""
stderr:
"""
JIT success (65.9ms): block in <main>@-e:6 -> /tmp/_ruby_mjit_p54157u0.c
gcc: fatal error: output filename may not be empty
compilation terminated.
Successful MJIT finish
"""
.
<2> expected but was
<1>.
Finished tests in 440.892116s, 47.2746 tests/s, 6150.4071 assertions/s.
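
For context, the "succeeded 1 times" figure comes from counting success lines in the JIT's stderr output. A minimal sketch of that counting, assuming a success-line pattern similar to the test suite's JIT_SUCCESS_PREFIX (the exact regexp and the helper name below are assumptions, not the test's real code):

```ruby
# Assumed pattern; the real JIT_SUCCESS_PREFIX in test_jit.rb may differ.
JIT_SUCCESS_PATTERN = /^JIT success \(\d+\.\d+ms\):/

# Hypothetical helper: count lines that report a successful JIT compilation.
def count_jit_successes(log, pattern)
  log.scan(pattern).size
end

# stderr copied from the failure report above.
stderr_log = <<~LOG
  JIT success (65.9ms): block in <main>@-e:6 -> /tmp/_ruby_mjit_p54157u0.c
  gcc: fatal error: output filename may not be empty
  compilation terminated.
  Successful MJIT finish
LOG

actual = count_jit_successes(stderr_log, JIT_SUCCESS_PATTERN)
puts "expected 2, got #{actual}"
```

Only the block compiled, and the second compilation (presumably of `multiply`) died in gcc, so the count is one short.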

Files

build-s390x.log (1.29 MB), s390x build log, vo.x (Vit Ondruch), 07/10/2019 04:40 PM
build-armv7hl.log (1.26 MB), armv7hl build log, vo.x (Vit Ondruch), 07/10/2019 05:01 PM
build-x86_64.log (1.28 MB), x86_64, vo.x (Vit Ondruch), 07/10/2019 05:01 PM
mjit_debug.diff (1.69 KB), k0kubun (Takashi Kokubun), 07/12/2019 01:07 PM
build-x86_64.log (2.76 KB), vo.x (Vit Ondruch), 07/12/2019 03:03 PM
build-s390x.log (3.03 KB), vo.x (Vit Ondruch), 07/12/2019 03:03 PM
build-armv7hl.log (2.86 KB), vo.x (Vit Ondruch), 07/12/2019 03:04 PM
mjit_debug2.diff (2.66 KB), k0kubun (Takashi Kokubun), 07/12/2019 03:48 PM
build-armv7hl.log (4.48 KB), vo.x (Vit Ondruch), 07/15/2019 01:14 PM
build-s390x.log (5.06 KB), vo.x (Vit Ondruch), 07/15/2019 01:14 PM
build-x86_64.log (5.09 KB), vo.x (Vit Ondruch), 07/15/2019 01:14 PM

Updated by k0kubun (Takashi Kokubun) about 1 year ago

  • Assignee set to k0kubun (Takashi Kokubun)
  • Status changed from Open to Assigned

Thanks for the report. I'd like to know more about the context so that I can fix the issue.

  • Does the error happen at the same place when you retry running the tests?
  • If so, could you share the output of the following command and all .c/.h files referenced in it?
$ ruby --disable-gems --jit-verbose=2 --jit-save-temps --jit-wait --jit-min-calls=2 -e "
def multiply(a, b)
  a *= b
end
3.times do
  p multiply(7.0, 10.0)
end
"

In my case, they were /home/k0kubun/.rbenv/versions/ruby/include/ruby-2.7.0/x86_64-linux/rb_mjit_min_header-2.7.0.h, /tmp/_ruby_mjit_p17484u0.c, and /tmp/_ruby_mjit_p17484u1.c. The output of ls -la /tmp after running the command may also be helpful.
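
If collecting those files by hand is awkward, something like the following could dump them in one go. This is only a sketch: the glob assumes the temporary sources follow the `_ruby_mjit_p<pid>u<n>.c` naming seen in the paths above, and `mjit_temp_sources` is a hypothetical helper name.

```ruby
require "tmpdir"

# Hypothetical helper: list C sources that look like MJIT temporaries.
# The file name pattern is an assumption based on the paths quoted above.
def mjit_temp_sources(dir = Dir.tmpdir)
  Dir.glob(File.join(dir, "_ruby_mjit_p*u*.c")).sort
end

# Print each file with a separator so the contents stay readable in a build log.
mjit_temp_sources.each do |path|
  puts "**********", "* #{path}", "", File.read(path), "---"
  $stdout.flush
end
```

Run it right after the command above, since the files are only kept when --jit-save-temps is given.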

Updated by vo.x (Vit Ondruch) about 1 year ago

I wish this was easier to debug. The problem is that this is a test failure and it happens on the build system, to which I don't have access. I tried to reproduce it on my system, but this does not work:

$ echo "
def multiply(a, b)
  a *= b
end
3.times do
  p multiply(7.0, 10.0)
end
" > test.rb

$ make runruby TESTRUN_SCRIPT="--disable-gems --jit-verbose=2 --jit-save-temps --jit-wait --jit-min-calls=2 test.rb"
./revision.h unchanged
./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems --disable-gems --jit-verbose=2 --jit-save-temps --jit-wait --jit-min-calls=2 test.rb
MJIT: CC defaults to /usr/bin/gcc
MJIT: tmp_dir is /tmp
Cannot access header file: /usr/include/rb_mjit_min_header-2.7.0.h
Failure in MJIT header file name initialization

70.0
70.0
70.0

I will probably have to patch the test case so that it prints the contents of the files :/

Updated by k0kubun (Takashi Kokubun) about 1 year ago

  • Status changed from Assigned to Feedback

I see. Thanks for the information. At this moment I cannot do anything either, so I'll wait for you to collect the information from the CI system somehow.

Updated by mame (Yusuke Endoh) about 1 year ago

FYI: RubyCI platforms include RHEL 7.1 s390x and Ubuntu armv8 (aarch64), and their results are both green at present. So I guess the cause is some factor other than the CPU.

Updated by vo.x (Vit Ondruch) about 1 year ago

So this is my hacked up test case:

$ git diff
diff --git a/test/ruby/test_jit.rb b/test/ruby/test_jit.rb
index 08494cbbbb..9ace7754d4 100644
--- a/test/ruby/test_jit.rb
+++ b/test/ruby/test_jit.rb
@@ -944,9 +944,15 @@ def assert_compile_once(script, result_inspect:, insns: [])
   end

   # Shorthand for normal test cases
-  def assert_eval_with_jit(script, stdout: nil, success_count:, min_calls: 1, insns: [], uplevel: 3)
-    out, err = eval_with_jit(script, verbose: 1, min_calls: min_calls)
+  def assert_eval_with_jit(script, stdout: nil, success_count:, min_calls: 2, insns: [], uplevel: 3)
+    out, err = eval_with_jit(script, verbose: 2, min_calls: min_calls, save_temps: true)
     actual = err.scan(/^#{JIT_SUCCESS_PREFIX}:/).size
+    puts "", "**********", "* rb_mjit_min_header-2.7.0.h", "---", ""
+    $stdout.flush
+    puts File.read(".ext/include/x86_64-linux/rb_mjit_min_header-2.7.0.h")
+    # puts File.read(".ext/include/armv7hl-linux/rb_mjit_min_header-2.7.0.h")
+    # puts File.read(".ext/include/s390x-linux/rb_mjit_min_header-2.7.0.h")
+    Dir.glob('/tmp/*.c').each {|f| puts '**********', "* #{f}", "", File.read(f), "---"; $stdout.flush}
     # Add --jit-verbose=2 logs for cl.exe because compiler's error message is suppressed
     # for cl.exe with --jit-verbose=1. See `start_process` in mjit_worker.c.
     if RUBY_PLATFORM.match?(/mswin/) && success_count != actual

And I run just the single test:

make test-all TESTS="test/ruby/test_jit.rb -n /test_block_handler_with_possible_frame_omitted_inlining/"

See the attached logs from s390x, armv7hl and x86_64 (apologies, some of the lines might be slightly intermingled by the build system, but I hope you can handle that).

BTW I was wrong in saying that it fails on AArch64; it actually fails on armv7hl.

Updated by k0kubun (Takashi Kokubun) about 1 year ago

Thank you. All of the information helps me a lot.

It seems that the command line construction is broken for the second compilation in build-armv7hl.log and build-s390x.log, while build-x86_64.log seems okay. I have attached "mjit_debug.diff" to this ticket to collect more information from your build environments. Could you apply it and share the build logs again?

Updated by vo.x (Vit Ondruch) about 1 year ago

Here are the logs (a bit messy again, but I hope you can get the information).

Updated by vo.x (Vit Ondruch) about 1 year ago

BTW a bit OT, but seeing all the information stored in the rb_mjit_min_header-2.7.0.h, I am not sure the JIT will work for binary distributions such as Fedora/RHEL. A lot of information about the machine used for the build appears to be embedded there, while the JIT has to run on quite a different machine.

Updated by k0kubun (Takashi Kokubun) about 1 year ago

Thank you for the additional information. Could you also test the new "mjit_debug2.diff", which I have just attached, in the same way?

BTW a bit OT, but seeing all the information stored in the rb_mjit_min_header-2.7.0.h, I am not sure the JIT will work for binary distributions such as Fedora/RHEL.

MJIT's support policy is that the compiler used for runtime MJIT compilation, and its path, must be the same as the one used to build the Ruby binary. Anything else is simply unsupported. Even for a binary distribution, you could also distribute a compiler as needed.
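
As a rough way to check that policy on a target machine, one could see whether the compiler recorded at build time is still available. This is a sketch, not how MJIT itself locates the compiler: it only reads `RbConfig::CONFIG["CC"]` and searches PATH, and `find_executable_path` is a hypothetical helper.

```ruby
require "rbconfig"

# Hypothetical helper: return the path of `command` if it resolves to an
# executable file, either directly (when it contains a path separator) or
# via the given search path. Returns nil otherwise.
def find_executable_path(command, search_path = ENV.fetch("PATH", ""))
  name = command.to_s.split.first # drop flags such as "gcc -m64"
  return nil if name.nil?

  if name.include?(File::SEPARATOR)
    return File.file?(name) && File.executable?(name) ? name : nil
  end

  search_path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

build_cc = RbConfig::CONFIG["CC"] # compiler recorded when this Ruby was built
found = find_executable_path(build_cc)

if found
  puts "Build-time compiler #{build_cc.inspect} found at #{found}"
else
  puts "Build-time compiler #{build_cc.inspect} not found; MJIT is unlikely to work here"
end
```

Finding a compiler with the same name does not prove it is the same version as the build-time one, so treat a positive result as necessary rather than sufficient.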

Updated by k0kubun (Takashi Kokubun) about 1 year ago

  • Status changed from Feedback to Closed

Applied in changeset git|d8cc41c43be65dd4b17e7a6e38f5a7fdf2b247d6.


Fix a wrong buffer size to avoid stack corruption

[Bug #15986]

Updated by k0kubun (Takashi Kokubun) about 1 year ago

Fortunately, a very similar issue was reproducible on my macOS machine. I did the mjit_debug2.diff investigation on my own and found the bug that is fixed by d8cc41c43be65dd4b17e7a6e38f5a7fdf2b247d6. The commit fixed the behavior on my machine, so I hope it's fixed in your environment too.

Updated by vo.x (Vit Ondruch) about 1 year ago

Here are the logs again. I am going to try the latest master and will report back if that helps.

Updated by vo.x (Vit Ondruch) about 1 year ago

I did several builds with 0c6c937904 and all passed. Thanks for the fix.
