Bug #15522

TestJIT#test_compile_insn_local fails on aarch64 RHEL7

Added by vo.x (Vit Ondruch) over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Target version: -
ruby -v: ruby 2.6.0p0 (2018-12-25 revision 66547) [aarch64-linux]
[ruby-core:90986]

Description

Trying to build Ruby 2.6 on RHEL7, I observe the following test failure, but only on aarch64. The other platforms pass just fine:

  1) Failure:
TestJIT#test_compile_insn_local [/builddir/build/BUILD/ruby-2.6.0/test/ruby/test_jit.rb:64]:
Expected 3 times of JIT success, but succeeded 2 times.
script:

def foo
  a = 0
  [1, 2].each do |i|
    a += i
    [3, 4].each do |j|
      a *= j
    end
  end
  a
end
print foo

stderr:

JIT success (276.6ms): foo@-e:2 -> /tmp/_ruby_mjit_p20163u0.c
JIT success (347.3ms): block in foo@-e:4 -> /tmp/_ruby_mjit_p20163u1.c
MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p20163u2.so': /tmp/_ruby_mjit_p20163u2.so: undefined symbol: __multi3
Successful MJIT finish

.
<3> expected but was
<2>.

I suspect this must be some combination of architecture/compiler (gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)), because I don't observe similar issues on Fedora.
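
The failed assertion comes down to counting the `JIT success` lines on stderr and comparing that count with the number of compiled units the test expects. The following is a minimal sketch of that check, not the actual helper in test/ruby/test_jit.rb; the names `count_jit_successes` and `STDERR_OUTPUT` are made up for illustration, and the log text is copied from the failure above.

```ruby
# Stderr captured in the failing run (copied from the report above).
STDERR_OUTPUT = <<~LOG
  JIT success (276.6ms): foo@-e:2 -> /tmp/_ruby_mjit_p20163u0.c
  JIT success (347.3ms): block in foo@-e:4 -> /tmp/_ruby_mjit_p20163u1.c
  MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p20163u2.so': /tmp/_ruby_mjit_p20163u2.so: undefined symbol: __multi3
  Successful MJIT finish
LOG

# Hypothetical helper: count the lines that report a successful compilation.
def count_jit_successes(stderr_text)
  stderr_text.each_line.count { |line| line.start_with?("JIT success") }
end

expected = 3 # foo, the outer block, and the inner block
actual = count_jit_successes(STDERR_OUTPUT)
if actual != expected
  puts "Expected #{expected} times of JIT success, but succeeded #{actual} times."
end
```

The third unit was compiled but its .so could not be loaded, so it is not counted as a success.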


Files

_ruby_mjit_p207u2.c (3.48 KB), vo.x (Vit Ondruch), 01/10/2019 11:47 AM
rb_mjit_min_header-2.6.0.h (781 KB), vo.x (Vit Ondruch), 01/11/2019 09:58 AM
mjit_multi3.tgz (993 KB), vo.x (Vit Ondruch), 01/11/2019 09:59 AM

Updated by vo.x (Vit Ondruch) over 1 year ago

Also, it would be nice if the JIT output used different markup, which does not collide with Redmine markup :/

Updated by sharkcz (Dan Horák) over 1 year ago

IMHO either something calls __multi3 and doesn't link to libgcc, or gcc emits a call to __multi3 and it's not implemented in libgcc.
It would be useful to see the /tmp/_ruby_mjit_p20163u1.c source and the command line used to compile/link it.
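
For context, `__multi3` is the libgcc runtime routine that multiplies two 128-bit integers; GCC emits a call to it on targets where it does not expand that multiplication inline. The sketch below only emulates the arithmetic (signed 128-bit multiplication with wrap-around) in Ruby, to show what the missing symbol computes. The method name `multi3` and the constants are chosen here for illustration. Which expression in the generated C triggers the call is not established in this ticket.

```ruby
INT128_MODULUS  = 1 << 128 # 2**128
INT128_SIGN_BIT = 1 << 127 # 2**127

# Emulate signed 128-bit multiplication with two's-complement wrap-around.
def multi3(a, b)
  product = (a * b) % INT128_MODULUS # keep only the low 128 bits
  product >= INT128_SIGN_BIT ? product - INT128_MODULUS : product
end

puts multi3(3, 4)          # 12
puts multi3(-3, 4)         # -12
puts multi3(2**127 - 1, 2) # -2, because the product wraps around
```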

Updated by vo.x (Vit Ondruch) over 1 year ago

A smaller reproducer is probably:

$ ruby --disable-gems --jit-verbose=10 --jit-min-calls=1 --jit-debug --jit-wait --jit-save-temps -e "
    begin
      def foo
        a = 0
        [1, 2].each do |i|
          a += i
          [3, 4].each do |j|
            a *= j
          end
        end
        a
      end

      print foo
    end"
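
Independently of MJIT, the reproducer is deterministic, so the value it prints can be checked by hand. The block below is the same method with the intermediate values annotated; only the comments are new.

```ruby
def foo
  a = 0
  [1, 2].each do |i|
    a += i   # i=1: a = 1        i=2: a = 12 + 2 = 14
    [3, 4].each do |j|
      a *= j # i=1: a = 3, 12    i=2: a = 42, 168
    end
  end
  a
end

print foo # prints 168
```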

Updated by vo.x (Vit Ondruch) over 1 year ago

This should be the C code (although generated on my x86_64, because I don't have aarch64 readily available).

Updated by k0kubun (Takashi Kokubun) over 1 year ago

Also, it would be nice if the JIT output used different markup, which does not collide with Redmine markup :/

Yeah. I'll change that later.

This should be the C code (although generated on my x86_64, because I don't have aarch64 readily available).

How did you get the output in the ticket description? If possible, please upload rb_mjit_min_header-2.6.0.h from the install directory and the exact /tmp/_ruby_mjit_p20163u1.c whose .so can't be loaded.

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Assignee set to k0kubun (Takashi Kokubun)

Updated by vo.x (Vit Ondruch) over 1 year ago

  • Assignee deleted (k0kubun (Takashi Kokubun))

k0kubun (Takashi Kokubun) wrote:

Also, it would be nice if the JIT output used different markup, which does not collide with Redmine markup :/

Yeah. I'll change that later.

Thx

This should be the C code (although generated on my x86_64, because I don't have aarch64 readily available).

How did you get the output in the ticket description?

That is from a builder that is not easily accessible. I can submit an SRPM and get the resulting logs and RPMs, but I cannot easily access the intermediate results :/

If possible, please upload rb_mjit_min_header-2.6.0.h from the install directory and the exact /tmp/_ruby_mjit_p20163u1.c whose .so can't be loaded.

I will see what I can do about it. Hopefully, sharkcz (Dan Horák) will be able to help :)

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Status changed from Open to Feedback

Thank you. I'll wait for that first.
Also, logs from running the reproducer in https://bugs.ruby-lang.org/issues/15522#note-3 with --jit-verbose=2 would be helpful.

P.S. The output format was changed in r66781.

Updated by vo.x (Vit Ondruch) over 1 year ago

So here is the output:

$ ruby --disable-gems --jit-verbose=2 --jit-min-calls=1 --jit-debug --jit-wait --jit-save-temps -e "
    begin
      def foo
        a = 0
        [1, 2].each do |i|
          a += i
          [3, 4].each do |j|
            a *= j
          end
        end
        a
      end

      print foo
    end"
MJIT: CC defaults to /usr/bin/gcc
MJIT: tmp_dir is /tmp
Creating precompiled header
Starting process: /usr/bin/gcc /usr/bin/gcc -w -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_hp111u0.h.gch /opt/rh/rh-ruby26/root/usr/include/rb_mjit_min_header-2.6.0.h
start compilation: foo@-e:3 -> /tmp/_ruby_mjit_p111u0.c
Starting process: /usr/bin/gcc /usr/bin/gcc -w -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_p111u0.o /tmp/_ruby_mjit_p111u0.c -c -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib
Starting process: /usr/bin/gcc /usr/bin/gcc -shared -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_p111u0.so /tmp/_ruby_mjit_p111u0.o -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib
MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p111u0.so': /tmp/_ruby_mjit_p111u0.so: undefined symbol: __multi3
start compilation: block in foo@-e:5 -> /tmp/_ruby_mjit_p111u1.c
Starting process: /usr/bin/gcc /usr/bin/gcc -w -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_p111u1.o /tmp/_ruby_mjit_p111u1.c -c -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib
Starting process: /usr/bin/gcc /usr/bin/gcc -shared -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_p111u1.so /tmp/_ruby_mjit_p111u1.o -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib
MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p111u1.so': /tmp/_ruby_mjit_p111u1.so: undefined symbol: __multi3
start compilation: block (2 levels) in foo@-e:7 -> /tmp/_ruby_mjit_p111u2.c
Starting process: /usr/bin/gcc /usr/bin/gcc -w -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_p111u2.o /tmp/_ruby_mjit_p111u2.c -c -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib
Starting process: /usr/bin/gcc /usr/bin/gcc -shared -Wfatal-errors -fPIC -shared -w -pipe -ggdb3 -o /tmp/_ruby_mjit_p111u2.so /tmp/_ruby_mjit_p111u2.o -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib
MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p111u2.so': /tmp/_ruby_mjit_p111u2.so: undefined symbol: __multi3
168Stopping worker thread
Successful MJIT finish

The header and the contents of the /tmp directory (except the .gch, which was huge :/) are attached.
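
To summarize a verbose log like the one above, the load failures can be grouped by missing symbol. This is a small sketch; `LOAD_FAILURE`, `SAMPLE_LOG`, and `missing_symbols` are names chosen here, and the sample lines are copied from the output above.

```ruby
# Matches the MJIT warning printed when a compiled .so cannot be loaded.
LOAD_FAILURE = /failure in loading code from '([^']+)': .*undefined symbol: (\S+)/

SAMPLE_LOG = <<~LOG
  MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p111u0.so': /tmp/_ruby_mjit_p111u0.so: undefined symbol: __multi3
  MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p111u1.so': /tmp/_ruby_mjit_p111u1.so: undefined symbol: __multi3
  MJIT warning: failure in loading code from '/tmp/_ruby_mjit_p111u2.so': /tmp/_ruby_mjit_p111u2.so: undefined symbol: __multi3
LOG

# Returns a hash mapping each missing symbol to the .so files that need it.
def missing_symbols(log)
  result = Hash.new { |hash, key| hash[key] = [] }
  log.scan(LOAD_FAILURE) { |so_path, symbol| result[symbol] << so_path }
  result
end

missing_symbols(SAMPLE_LOG).each do |symbol, paths|
  puts "#{symbol}: #{paths.size} unit(s) failed to load"
end
```

In this run every unit fails on the same symbol, which points at the link step rather than at any single compiled method.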

k0kubun (Takashi Kokubun) wrote:

P.S. The output format was changed in r66781.

Nice, thanks!

Updated by sharkcz (Dan Horák) over 1 year ago

I suppose "-nostartfiles -nodefaultlibs -nostdlib" are the reason that libgcc isn't linked into the _ruby_mjit_p111u0.so

__multi3 is provided in /lib64/libgcc_s-4.8.5-20150702.so.1

I guess -nodefaultlibs should be omitted and/or -static-libgcc added. Skipping libgcc seems dangerous in general on any arch.

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Assignee set to k0kubun (Takashi Kokubun)
  • Status changed from Open to Assigned

The header and the contents of the /tmp directory (except the .gch, which was huge :/) are attached.

Thanks! Much appreciated.

I suppose "-nostartfiles -nodefaultlibs -nostdlib" are the reason that libgcc isn't linked into the _ruby_mjit_p111u0.so

__multi3 is provided in /lib64/libgcc_s-4.8.5-20150702.so.1

Possibly. It's causing issues on #15513 as well... I'll try to fix it.

Updated by vo.x (Vit Ondruch) over 1 year ago

sharkcz (Dan Horák) wrote:

I guess -nodefaultlibs should be omitted and/or -static-libgcc added. Skipping libgcc seems dangerous in general on any arch.

I removed -nodefaultlibs and it changed nothing. But the -nostdlib description in the GCC options manual says:

In other words, when you specify -nostdlib or -nodefaultlibs you should usually specify -lgcc as well.

and indeed, after adding -lgcc, the build passed. Also, -lgcc is already used in some cases; maybe it should be used in all cases ...
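
The rule quoted from the GCC manual can be expressed as a small check on the linker arguments. This is a Ruby sketch for illustration only, not how MJIT itself builds its command line; `with_libgcc` and `NO_DEFAULT_LIB_FLAGS` are hypothetical names, and the sample arguments are taken from the verbose log earlier in this ticket.

```ruby
# Flags that stop GCC from linking its default libraries, including libgcc.
NO_DEFAULT_LIB_FLAGS = %w[-nostdlib -nodefaultlibs].freeze

# Hypothetical helper: append -lgcc when the default libraries are suppressed.
def with_libgcc(link_args)
  needs_libgcc = link_args.any? { |arg| NO_DEFAULT_LIB_FLAGS.include?(arg) }
  return link_args if !needs_libgcc || link_args.include?("-lgcc")

  link_args + ["-lgcc"] # libraries go after the object files that use them
end

args = %w[-shared -fPIC -o /tmp/_ruby_mjit_p111u0.so /tmp/_ruby_mjit_p111u0.o
          -Wl,-z,relro -nostartfiles -nodefaultlibs -nostdlib]
puts with_libgcc(args).join(" ")
```

Appending the library at the end matters because the linker resolves symbols from libraries listed after the objects that reference them.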

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Status changed from Assigned to Closed

Applied in changeset trunk|r66812.


mjit_worker.c: pass -lgcc to GCC platforms

using -nodefaultlibs -nostdlib.

I assume libgcc is needed when we use -nostdlib, and it's linked on some
platforms but not linked on some platforms (like aarch64, and possibly
AIX as well) as said in https://wiki.osdev.org/Libgcc :

You can link with libgcc by passing -lgcc when linking your kernel
with your compiler. You don't need to do this unless you pass the
-nodefaultlibs option (implied by -nostdlib)

Also note that -nostdlib is not strictly needed (rather implied
-nodefaultlibs is problematic for Gentoo like Bug#15513, which will be
approached later) but helpful for performance. So I want to keep it for
now.

[Bug #15522]

I'm not trying to add -nodefaultlibs -nostdlib for AIX in this commit
because AIX RubyCI is dead right now, but I'll try to add them again
once RubyCI is fixed.

Updated by k0kubun (Takashi Kokubun) over 1 year ago

vo.x (Vit Ondruch) Could you check if r66812 works?

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Backport changed from 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN to 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED

Updated by vo.x (Vit Ondruch) over 1 year ago

k0kubun (Takashi Kokubun) wrote:

vo.x (Vit Ondruch) Could you check if r66812 works?

Actually it is r66811 + r66812. Applying these two patches, the test suite passes on RHEL7 on all supported architectures and it keeps passing on Fedora Rawhide on all supported architectures. Thx for the fix.

Updated by naruse (Yui NARUSE) over 1 year ago

  • Backport changed from 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED to 2.4: DONTNEED, 2.5: DONTNEED, 2.6: DONE

ruby_2_6 r66849 merged revision(s) 66811,66812,66816.
