Bug #21655: segfault when building 3.3.10, regression from 3.3.9

Added by kurly (Greg Kubaryk) 1 day ago. Updated about 8 hours ago.

Status: Open
Assignee: -
Target version: -
ruby -v: ruby 3.3.10 (2025-10-23 revision 343ea05002) [x86_64-linux]
[ruby-core:123582]

Description

Downstream bug: https://bugs.gentoo.org/965095. I am reporting this upstream because I was able to reproduce the problem manually from ruby-3.3.10.tar.xz.

Build log excerpt (the full log is provided as an attachment):

gcc -O2 -pipe -march=amdfam10  -L. -fstack-protector-strong -rdynamic -Wl,-export-dynamic -fstack-protector-strong -pie  main.o dmydln.o miniinit.o dmyext.o array.o ast.o bignum.o class.o compar.o compile.o complex.o cont.o debug.o debug_counter.o dir.o dln_find.o encoding.o enum.o enumerator.o error.o eval.o file.o gc.o hash.o inits.o io.o io_buffer.o iseq.o load.o marshal.o math.o memory_view.o rjit.o rjit_c.o node.o node_dump.o numeric.o object.o pack.o parse.o parser_st.o proc.o process.o ractor.o random.o range.o rational.o re.o regcomp.o regenc.o regerror.o regexec.o regparse.o regsyntax.o ruby.o ruby_parser.o scheduler.o shape.o signal.o sprintf.o st.o strftime.o string.o struct.o symbol.o thread.o time.o transcode.o util.o variable.o version.o vm.o vm_backtrace.o vm_dump.o vm_sync.o vm_trace.o weakmap.o prism/api_node.o prism/api_pack.o prism/diagnostic.o prism/encoding.o prism/extension.o prism/node.o prism/options.o prism/pack.o prism/prettyprint.o prism/regexp.o prism/serialize.o prism/token_type.o prism/util/pm_buffer.o prism/util/pm_char.o prism/util/pm_constant_pool.o prism/util/pm_list.o prism/util/pm_memchr.o prism/util/pm_newline_list.o prism/util/pm_state_stack.o prism/util/pm_string.o prism/util/pm_string_list.o prism/util/pm_strncasecmp.o prism/util/pm_strpbrk.o prism/prism.o prism_init.o yjit.o yjit/target/release/libyjit.o coroutine/amd64/Context.o  enc/ascii.o enc/us_ascii.o enc/unicode.o enc/utf_8.o enc/trans/newline.o setproctitle.o addr2line.o  -lz -lrt -lrt -lgmp -ldl -lcrypt -lm -lpthread  -o miniruby
:
./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -o builtin_binary.inc \
	./template/builtin_binary.inc.tmpl
make: *** [uncommon.mk:1316: builtin_binary.inc] Segmentation fault (core dumped)

Files

buildlog (76.5 KB) buildlog output of: ./configure CFLAGS="-O2 -pipe -march=amdfam10" && make -j8 V=1 kurly (Greg Kubaryk), 10/29/2025 05:33 AM

Updated by kurly (Greg Kubaryk) 1 day ago #1 [ruby-core:123583]

Backtrace from a Gentoo-built tree with -ggdb3 added to CFLAGS:

beans ~ # cd /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/
beans /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10 # gdb --args ./miniruby -I./lib -I. -I.ext/common  -n -e 'BEGIN{version=ARGV.shift;mis=ARGV.dup}' -e 'END{abort "UNICODE version mismatch: #{mis}" unless mis.empty?}' -e '(mis.delete(ARGF.path); ARGF.close) if /ONIG_UNICODE_VERSION_STRING +"#{Regexp.quote(version)}"/o' 15.0.0 ./enc/unicode/15.0.0/casefold.h ./enc/unicode/15.0.0/name2ctype.h ./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -o builtin_binary.inc ./template/builtin_binary.inc.tmpl
GNU gdb (Gentoo 16.3 vanilla) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/.gdbinit
line to your configuration file "/root/.config/gdb/gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/root/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
(gdb) run
Starting program: /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/miniruby -I./lib -I. -I.ext/common -n -e BEGIN\{version=ARGV.shift\;mis=ARGV.dup\} -e END\{abort\ \"UNICODE\ version\ mismatch:\ \#\{mis\}\"\ unless\ mis.empty\?\} -e \(mis.delete\(ARGF.path\)\;\ ARGF.close\)\ if\ /ONIG_UNICODE_VERSION_STRING\ +\"\#\{Regexp.quote\(version\)\}\"/o 15.0.0 ./enc/unicode/15.0.0/casefold.h ./enc/unicode/15.0.0/name2ctype.h ./miniruby -I./lib -I. -I.ext/common ./tool/generic_erb.rb -o builtin_binary.inc ./template/builtin_binary.inc.tmpl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7c76d69 in malloc_usable_size () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff7c76d69 in malloc_usable_size () from /lib64/libc.so.6
#1  0x000055555563b490 in objspace_malloc_size (objspace=0x5555559a7550, ptr=0x45005555559e6dc0, hint=<optimized out>)
    at gc.c:12381
#2  objspace_xrealloc (objspace=0x5555559a7550, ptr=0x45005555559e6dc0, new_size=18, old_size=<optimized out>) at gc.c:12689
#3  0x0000555555760a8a in rb_str_resize (str=str@entry=140737348585000, len=17) at string.c:3179
#4  0x0000555555760c2d in rb_fstring (str=str@entry=140737348585000) at ./include/ruby/internal/core/rstring.h:369
#5  0x00005555557a32c8 in build_const_pathname (head=<optimized out>, tail=140737348585040) at variable.c:397
#6  rb_set_class_path_string (klass=klass@entry=140737348547760, under=under@entry=140737348551920, name=140737348585040)
    at variable.c:417
#7  0x000055555559c2f4 in rb_define_class_id_under_no_pin (outer=140737348551920, id=10891, super=140737348552240, 
    super@entry=93824992527378) at class.c:1035
#8  0x000055555559c412 in rb_define_class_id_under (outer=<optimized out>, id=<optimized out>, super=super@entry=93824992527378)
    at class.c:1045
#9  0x000055555559c470 in rb_define_class_under (outer=<optimized out>, name=name@entry=0x555555846fdc "EADDRINUSE", 
    super=93824992527378) at class.c:1006
#10 0x000055555560e153 in set_syserr (n=n@entry=98, name=name@entry=0x555555846fdc "EADDRINUSE") at error.c:2711
#11 0x00005555556148e9 in Init_syserr () at /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/known_errors.inc:18
#12 0x0000555555647eaa in rb_call_inits () at inits.c:40
#13 0x0000555555618943 in ruby_setup () at eval.c:89
#14 0x000055555561a40d in ruby_init () at eval.c:101
#15 0x0000555555576193 in rb_main (argc=22, argv=0x7fffffffe098) at ./main.c:38
#16 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:64
(gdb) frame 3
#3  0x0000555555760a8a in rb_str_resize (str=str@entry=140737348585000, len=17) at string.c:3179
3179	            SIZED_REALLOC_N(RSTRING(str)->as.heap.ptr, char,
(gdb) p str
$1 = 140737348585000
(gdb) p *str
$2 = 8396805
(gdb) 
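
One detail in the trace above may be worth spelling out: the `ptr` argument in frames #1 and #2, `0x45005555559e6dc0`, has non-zero upper bits, while the other heap address shown (`objspace=0x5555559a7550`) does not. The following is only an illustrative sketch and not part of the original report. It assumes the common x86-64 Linux layout with 48-bit virtual addresses and user space in the lower half, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumption: 48-bit virtual addresses with user space in the lower half,
 * so a plausible user-space pointer has bits 47..63 all clear. */
static int looks_like_user_pointer(uint64_t addr)
{
    return (addr >> 47) == 0;
}

/* The 16 bits above the 48-bit address range. */
static unsigned upper16(uint64_t addr)
{
    return (unsigned)(addr >> 48);
}

/* The low 48 bits, i.e. the part that would form the address itself. */
static uint64_t lower48(uint64_t addr)
{
    return addr & UINT64_C(0x0000FFFFFFFFFFFF);
}
```

Applied to the values in the trace, `looks_like_user_pointer(0x45005555559e6dc0)` is 0, `upper16` gives `0x4500`, and `lower48` gives `0x5555559e6dc0`, which resembles the neighbouring heap addresses. That is consistent with the pointer field holding a partly overwritten value, though it does not by itself show where the stray bytes came from.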

Updated by hsbt (Hiroshi SHIBATA) 1 day ago #2

  • Description updated

Updated by kurly (Greg Kubaryk) 1 day ago #3 [ruby-core:123584]

Thank you for fixing the markdown in comment 0.

On an affected machine, I was able to bisect the git repo between tags v3_3_9 and v3_3_10:

5a8d7642168f4ea0d9331fded3033c225bbc36c5 is the first bad commit
commit 5a8d7642168f4ea0d9331fded3033c225bbc36c5 (HEAD)
Author:     nagachika <nagachika@ruby-lang.org>
AuthorDate: Wed Oct 8 22:55:33 2025 +0900
Commit:     nagachika <nagachika@ruby-lang.org>
CommitDate: Wed Oct 8 22:56:02 2025 +0900

    merge revision(s) 43dbb9a93f4de3f1170d7d18641c30e81cc08365, 2bb6fe3854e2a4854bb89bfce4eaaea9d848fd1b, 7c9dd0ecff61153b96473c6c51d5582e809da489: [Backport #21629]
    
            [PATCH] [Bug #21629] Enable `nonstring` attribute on clang 21
    
            [PATCH] [Bug #21629] Initialize `struct RString`
    
            [PATCH] [Bug #21629] Initialize `struct RArray`

 error.c                                | 2 +-
 ext/-test-/string/fstring.c            | 2 +-
 include/ruby/internal/attr/nonstring.h | 8 ++++++++
 include/ruby/internal/core/rbasic.h    | 3 +++
 include/ruby/internal/core/rstring.h   | 2 +-
 load.c                                 | 4 ++--
 marshal.c                              | 2 +-
 string.c                               | 8 ++++----
 symbol.c                               | 8 ++++----
 version.h                              | 2 +-
 10 files changed, 26 insertions(+), 15 deletions(-)

I was not able to reproduce the build failure for ruby 3.3.10 on an Ubuntu 24.04 machine using gcc-13.3.0.

Updated by kurly (Greg Kubaryk) 1 day ago #4 [ruby-core:123585]

I manually bisected inside that "bad" commit and found that this minimal diff on top of v3_3_10 eliminates the build failure:

diff --git a/include/ruby/internal/core/rstring.h b/include/ruby/internal/core/rstring.h
index 9cf9daa97c..0bca74e688 100644
--- a/include/ruby/internal/core/rstring.h
+++ b/include/ruby/internal/core/rstring.h
@@ -395,7 +395,7 @@ rbimpl_rstring_getmem(VALUE str)
     }
     else {
         /* Expecting compilers to optimize this on-stack struct away. */
-        struct RString retval = {RBASIC_INIT};
+        struct RString retval;
         retval.len = RSTRING_LEN(str);
         retval.as.heap.ptr = RSTRING(str)->as.embed.ary;
         return retval;
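
For readers unfamiliar with the pattern this diff toggles, here is a minimal sketch of the two variants side by side. The `struct demo_string` type, the `DEMO_BASIC_INIT` macro and all function names are hypothetical, simplified stand-ins, not Ruby's actual definitions. The sketch only illustrates that both variants assign every field the caller goes on to read, so in a correctly compiled program they should be indistinguishable to that caller.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Ruby's RBasic and RString. */
struct demo_basic {
    unsigned long flags;
    unsigned long klass;
};

struct demo_string {
    struct demo_basic basic;
    long len;
    union {
        struct { char *ptr; long capa; } heap;
        struct { char ary[1]; } embed;
    } as;
};

/* Assumption: plays the role RBASIC_INIT plays in the diff above. */
#define DEMO_BASIC_INIT {0, 0}

/* Variant as in v3_3_10: the first member is initialized explicitly,
 * which makes the compiler zero-fill the remaining members too. */
static struct demo_string getmem_initialized(char *buf, long len)
{
    struct demo_string retval = {DEMO_BASIC_INIT};
    retval.len = len;
    retval.as.heap.ptr = buf;
    return retval;
}

/* Variant after the diff: no initializer; only len and ptr are set,
 * so the remaining members hold indeterminate values. */
static struct demo_string getmem_uninitialized(char *buf, long len)
{
    struct demo_string retval;
    retval.len = len;
    retval.as.heap.ptr = buf;
    return retval;
}

/* Returns 1 when both variants agree on the two fields a caller reads. */
static int variants_agree(char *buf, long len)
{
    struct demo_string a = getmem_initialized(buf, len);
    struct demo_string b = getmem_uninitialized(buf, len);
    return a.len == b.len && a.as.heap.ptr == b.as.heap.ptr;
}
```

Because the only intended difference is whether the unused members are zeroed, a behavioural difference between the two builds points at either undefined behaviour elsewhere or a miscompilation, rather than at this function's logic.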

Updated by alanwu (Alan Wu) about 20 hours ago · Edited #5 [ruby-core:123604]

It's surprising that leaving the temporary struct uninitialized avoids the crash. Smells like a GCC bug or some UB on our end the optimizer is exploiting.

Does ./configure optflags=-fno-strict-aliasing ... help?

Updated by kurly (Greg Kubaryk) about 18 hours ago #6 [ruby-core:123605]

alanwu (Alan Wu) wrote in #note-5:

It's surprising that leaving the temporary struct uninitialized avoids the crash. Smells like a GCC bug or some UB on our end the optimizer is exploiting.

Does ./configure optflags=-fno-strict-aliasing ... help?

It doesn't appear to, whether added to CFLAGS or to optflags.

Updated by kurly (Greg Kubaryk) about 18 hours ago #7 [ruby-core:123606]

alanwu (Alan Wu) wrote in #note-5:

It's surprising that leaving the temporary struct uninitialized avoids the crash. Smells like a GCC bug or some UB on our end the optimizer is exploiting.

Does ./configure optflags=-fno-strict-aliasing ... help?

Per a suggestion on the downstream bug, I tried adding -fno-ipa-modref to CFLAGS, and that was sufficient to fix the build.

Updated by alanwu (Alan Wu) about 17 hours ago #8 [ruby-core:123607]

Looks like you're not building with LTO, so the miscomp from ipa-modref should be in rb_str_resize(). That should be enough for a bug report for GCC, since they need a preprocessed C file.

Maybe this is hitting the same GCC bug as this: https://patchwork.sourceware.org/project/gdb/patch/20250712131649.8372-1-tdevries@suse.de/#206552 which -fno-ipa-modref also fixes. Unfortunately the bug on GCC side is still unresolved: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120987

Updated by thesamesam (Sam James) about 8 hours ago · Edited #9 [ruby-core:123609]

alanwu (Alan Wu) wrote in #note-8:

Looks like you're not building with LTO, so the miscomp from ipa-modref should be in rb_str_resize(). That should be enough for a bug report for GCC, since they need a preprocessed C file.

Maybe this is hitting the same GCC bug as this: https://patchwork.sourceware.org/project/gdb/patch/20250712131649.8372-1-tdevries@suse.de/#206552 which -fno-ipa-modref also fixes. Unfortunately the bug on GCC side is still unresolved: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120987

That bug has a patch which should work around (or fix, it's not clear) that specific issue, but I asked OP to test it and it didn't help with this bug. So this may (or may not) still be an issue in modref, just probably not that bug.
