Bug #17871
Closed
TestGCCompact#test_ast_compacts test failing again
Description
This issue was found by @mame (Yusuke Endoh) yesterday on our new Power 9 server, and I would like to open a ticket for it.
The test failure was reported and fixed in #17306 six months ago.
However, on the latest master (adcbae8d49ec04d365ce13274783b1495c3c7d0e) on Power 9 (ppc64le), Ubuntu focal, the TestGCCompact#test_ast_compacts test fails again.
$ lscpu | head -3
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 8
$ lscpu | grep ^Model
Model: 2.2 (pvr 004e 1202)
Model name: POWER9 (architected), altivec supported
$ uname -m
ppc64le
$ cat /etc/os-release | head -3
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ autoconf
$ ./configure \
--prefix=${HOME}/local/ruby-master-adcbae8 \
--enable-shared
$ make
$ make install
$ make check 2>&1 | tee check.log
...
<internal:gc>:213: [BUG] Couldn't unprotect page 0x00000a1b33ba8000
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
c:0030 p:0026 s:0170 e:000169 METHOD /home/jaruga/git/ruby/ruby/test/ruby/test_gc_compact.rb:154
...
c:0001 p:0000 s:0003 E:001550 (none) [FINISH]
...
-- C level backtrace information -------------------------------------------
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_bugreport+0x1b4) [0x7441be9c78f4] vm_dump.c:759
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_bug_without_die+0x9c) [0x7441be740dec] error.c:777
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(die+0x0) [0x7441be6807dc] error.c:785
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_bug) error.c:787
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(unlock_page_body+0x10) [0x7441be771e80] gc.c:4889
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_fill_swept_page) gc.c:5184
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_page_sweep) gc.c:5392
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_sweep_step) gc.c:5562
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_sweep_rest+0x24) [0x7441be77208c] gc.c:5619
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_sweep) gc.c:5737
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_marks+0x15c) [0x7441be778d08] gc.c:8024
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_start) gc.c:8854
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_multi_ractor_p+0x0) [0x7441be77aa38] gc.c:8742
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_lock_leave) vm_sync.h:92
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(garbage_collect) gc.c:8744
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_start_internal) gc.c:9086
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_compact) gc.c:10002
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(builtin_invoker0+0x24) [0x7441be988d34] vm_insnhelper.c:5429
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x1914) [0x7441be9a8b14] vm_insnhelper.c:5569
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_yield+0x288) [0x7441be9b23c8] vm.c:1260
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ary_collect+0x74) [0x7441be68ca24] array.c:3646
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_0+0x24) [0x7441be9885b4] vm_insnhelper.c:2760
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_sendish+0x364) [0x7441be9a2ad4] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x294) [0x7441be9a7494] insns.def:754
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_yield+0x288) [0x7441be9b23c8] vm.c:1260
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ary_each+0x54) [0x7441be683844] array.c:2534
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_0+0x24) [0x7441be9885b4] vm_insnhelper.c:2760
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0xc0) [0x7441be9ae6d0] vm_insnhelper.c:3433
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0x510) [0x7441be9aeb20] vm_insnhelper.c:3412
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_sendish+0x364) [0x7441be9a2ad4] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x294) [0x7441be9a7494] insns.def:754
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_yield+0x288) [0x7441be9b23c8] vm.c:1260
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ary_each+0x54) [0x7441be683844] array.c:2534
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_0+0x24) [0x7441be9885b4] vm_insnhelper.c:2760
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0xc0) [0x7441be9ae6d0] vm_insnhelper.c:3433
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0x510) [0x7441be9aeb20] vm_insnhelper.c:3412
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_sendish+0x364) [0x7441be9a2ad4] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x294) [0x7441be9a7494] insns.def:754
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_iseq_eval+0x190) [0x7441be9b0bf0] vm.c:2406
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(require_internal+0x998) [0x7441be7cc8c8] load.c:594
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_require_string+0x44) [0x7441be7cd8f4] load.c:1142
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_f_require_relative+0x48) [0x7441be7cd9e8] load.c:857
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_1+0x28) [0x7441be988608] vm_insnhelper.c:2767
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0xc0) [0x7441be9ae6d0] vm_insnhelper.c:3433
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x1aac) [0x7441be9a8cac] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_iseq_eval_main+0xf0) [0x7441be9b0d50] vm.c:2417
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ec_exec_node+0xb8) [0x7441be74a218] eval.c:317
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ruby_run_node+0x7c) [0x7441be74e94c] eval.c:375
/home/jaruga/git/ruby/ruby/ruby(main+0x90) [0xa1b10ce0bb0] ./main.c:47
...
make: *** [uncommon.mk:802: yes-test-all] Aborted (core dumped)
$ make test-all TESTOPTS="-n test_compact_count" TESTS=test/ruby/test_gc_compact.rb
...
# Running tests:
[1/1] TestGCCompact#test_compact_count<internal:gc>:213: [BUG] Couldn't protect page 0x00000f8747228000
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
<internal:gc>:213: [BUG] Couldn't unprotect page 0x00000f8747234000
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
<internal:gc>:213: [BUG] Segmentation fault at 0x00000f8747236360
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
make: *** [uncommon.mk:802: yes-test-all] Segmentation fault (core dumped)
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
Some googling told me that the page size defaults to 64k on powerpc64le for some Linux distros. Possibly sysconf(3) does not report the correct page size?
Updated by jaruga (Jun Aruga) over 3 years ago
I was reluctant to upload the full log, check.log, because the file is large. However, I am uploading it now.
$ du -sh check.log
1.2M check.log
$ wc -l check.log
22963 check.log
Updated by mame (Yusuke Endoh) over 3 years ago
- Related to Bug #17306: TestGCCompact#test_ast_compacts test failures added
Updated by mame (Yusuke Endoh) over 3 years ago
xtkoba (Tee KOBAYASHI) wrote in #note-1:
Possibly sysconf(3) does not report the correct page size?
I think it reports 64k correctly.
$ ruby -retc -e'p Etc.sysconf(Etc::SC_PAGE_SIZE)'
65536
The fix for #17306 disabled auto compaction on platforms whose page size is greater than 4k, such as ppc64le. However, TestGCCompact#test_ast_compacts tests explicit (not auto) compaction, so it still runs on ppc64le. I'm unsure why the fix addressed the issue of #17306.
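As a rough illustration of why the page size matters here: mprotect(2) only accepts addresses aligned to the OS page size, so a heap page smaller than the OS page cannot, in general, be protected on its own. The sketch below assumes a 16 KiB Ruby heap page (an assumption for illustration, not a figure taken from this ticket), and `heap_page_protectable?` is a hypothetical helper name.

```ruby
require "etc"

# Assumed Ruby heap page size, for illustration only (16 KiB).
ASSUMED_HEAP_PAGE_SIZE = 16 * 1024

# Hypothetical helper: a heap page can be mprotect-ed individually only if
# its size is a whole multiple of the OS page size.
def heap_page_protectable?(os_page_size, heap_page_size = ASSUMED_HEAP_PAGE_SIZE)
  (heap_page_size % os_page_size).zero?
end

os_page_size = Etc.sysconf(Etc::SC_PAGE_SIZE)
puts "OS page size: #{os_page_size}"
puts "protectable:  #{heap_page_protectable?(os_page_size)}"
```

Under that assumption, a 4k OS page passes the check while a 64k OS page does not, which would be consistent with the "Couldn't unprotect page" failures reported above.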
Updated by jaruga (Jun Aruga) over 3 years ago
I ran each test in test/ruby/test_gc_compact.rb individually on the above environment; here is the result. test_ast_compacts is not the only failing test.
- test_enable_autocompact : ok
- test_disable_autocompact : ok
- test_major_compacts : ok
- test_implicit_compaction_does_something : ok
- test_gc_compact_stats : error
- test_complex_hash_keys : error
- test_ast_compacts : error
- test_compact_count : error
Here are the commands I executed.
$ make test-all TESTS="-v -n test_enable_autocompact test/ruby/test_gc_compact.rb"
# => ok
$ make test-all TESTS="-v -n test_disable_autocompact test/ruby/test_gc_compact.rb"
# => ok
$ make test-all TESTS="-v -n test_major_compacts test/ruby/test_gc_compact.rb"
# => ok
$ make test-all TESTS="-v -n test_implicit_compaction_does_something test/ruby/test_gc_compact.rb"
# => ok
$ make test-all TESTS="-v -n test_gc_compact_stats test/ruby/test_gc_compact.rb"
# => error (Segmentation fault)
$ make test-all TESTS="-v -n test_complex_hash_keys test/ruby/test_gc_compact.rb"
# => error (Segmentation fault)
$ make test-all TESTS="-v -n test_ast_compacts test/ruby/test_gc_compact.rb"
# => error (<internal:gc>:213: [BUG] Couldn't unprotect page 0x00000216f95c4000)
$ make test-all TESTS="-v -n test_compact_count test/ruby/test_gc_compact.rb"
# => error (Segmentation fault)
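For anyone repeating this on another machine, the per-test commands above could be generated rather than typed by hand. This is only a convenience sketch: the test names are copied from the list above, and `test_command` is a hypothetical helper that builds the command string without running it.

```ruby
# Test names copied from the result list above.
TEST_NAMES = %w[
  test_enable_autocompact
  test_disable_autocompact
  test_major_compacts
  test_implicit_compaction_does_something
  test_gc_compact_stats
  test_complex_hash_keys
  test_ast_compacts
  test_compact_count
].freeze

TEST_FILE = "test/ruby/test_gc_compact.rb"

# Hypothetical helper: build the make invocation for a single test method.
def test_command(name)
  %(make test-all TESTS="-v -n #{name} #{TEST_FILE}")
end

# Print one command per test; paste them into a shell in the Ruby source tree.
TEST_NAMES.each { |name| puts test_command(name) }
```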
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
mame (Yusuke Endoh) wrote in #note-4:
I'm unsure why the fix addressed the issue of #17306.
This is because after #17306 was fixed the commit 32b7dcfb56a417c1d1c354102351fc1825d653bf changed the behavior of {,un}lock_page_body so that they call mprotect(2) for an explicit compaction, regardless of the page size.
Updated by tenderlovemaking (Aaron Patterson) over 3 years ago
xtkoba (Tee KOBAYASHI) wrote in #note-6:
mame (Yusuke Endoh) wrote in #note-4:
I'm unsure why the fix addressed the issue of #17306.
This is because after #17306 was fixed the commit 32b7dcfb56a417c1d1c354102351fc1825d653bf changed the behavior of {,un}lock_page_body so that they call mprotect(2) for an explicit compaction, regardless of the page size.
Ya, that's correct. Even manual compaction requires the mprotect read barrier. Basically we need to disable compaction on platforms that don't support it. I'll make a fix.
Updated by tenderlovemaking (Aaron Patterson) over 3 years ago
- Status changed from Open to Closed
Applied in changeset git|fc832ffbfaf581ff63ef40dc3f4ec5c8ff39aae6.
Disable compaction on platforms that can't support it
Manual compaction also requires a read barrier, so we need to disable
even manual compaction on platforms that don't support mprotect.
[Bug #17871]
Updated by jaruga (Jun Aruga) over 3 years ago
@tenderlovemaking (Aaron Patterson) thanks for fixing it! I removed the skipped gc tests on Travis ppc64le in af43198738bf45d55d91d7f48b197f94dc526967.
Updated by wanabe (_ wanabe) over 2 years ago
Note for the backport maintainer:
The issue seems to be still reproduced in 3.0.
http://rubyci.s3.amazonaws.com/ppc64le/ruby-3.0/log/20220430T053940Z.fail.html.gz
Updated by vo.x (Vit Ondruch) over 2 years ago
wanabe (_ wanabe) wrote in #note-10:
Note for the backport maintainer:
The issue seems to be still reproduced in 3.0.
http://rubyci.s3.amazonaws.com/ppc64le/ruby-3.0/log/20220430T053940Z.fail.html.gz
I think that this should be covered by #18746 and the related tickets
Updated by jaruga (Jun Aruga) over 2 years ago
- Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.6: REQUIRED, 2.7: REQUIRED, 3.0: REQUIRED
Right now the patch is only applied to master and Ruby 3.1. I would like to see it backported to older Rubies.
https://github.com/ruby/ruby/commit/fc832ffbfaf581ff63ef40dc3f4ec5c8ff39aae6
This issue is related to #18560.
After applying this patch, gems can use GC.compact like this:
begin
  GC.compact
rescue NotImplementedError
  # Compaction is not supported on this platform; nothing to do.
end
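A slightly more defensive variant is sketched below. It assumes that, on platforms where compaction is disabled, GC.compact is registered as a not-implemented method, so `GC.respond_to?(:compact)` returns false there. The rescue is kept as a fallback in case that assumption does not hold for a given Ruby version. `compact_if_supported` is a hypothetical helper name, not part of Ruby.

```ruby
# Hypothetical helper: run GC.compact only where it is available.
# Returns true if compaction ran, false if the platform does not support it.
def compact_if_supported
  # Assumption: not-implemented methods make respond_to? return false.
  return false unless GC.respond_to?(:compact)

  GC.compact
  true
rescue NotImplementedError
  # Fallback for Rubies where the method exists but raises instead.
  false
end

puts compact_if_supported ? "compacted" : "compaction not supported"
```

Returning a boolean rather than the value of GC.compact keeps callers independent of what that method returns on a particular Ruby version.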