Bug #19606

closed

addr2line.c broken on Fedora 38

Added by kjtsanaktsidis (KJ Tsanaktsidis) about 1 year ago. Updated 12 months ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:113283]

Description

I'm running the Fedora 38 beta on my machine, and the Ruby crash reporter is itself crashing while trying to print C-level backtraces, with an error like this:

-- C level backtrace information -------------------------------------------
1344: Abbrev Number 27547 not found

This seems to happen because the debuginfo provided by Fedora 38 (e.g. for the libc frames) is compressed with dwz. Some debuginfo is shared between multiple files, and a reference to the shared file is stored in the .gnu_debugaltlink section of the debug object. The main debug object then contains forms of type DW_FORM_GNU_ref_alt and DW_FORM_GNU_strp_alt, whose values are offsets into the debugaltlink file.

For example, the libc debug symbols contain a .gnu_debugaltlink section:

% readelf -x .gnu_debugaltlink /usr/lib/debug/lib64/libc.so.6-2.37-1.fc38.x86_64.debug

Hex dump of section '.gnu_debugaltlink':
  0x00000000 2e2e2f2e 64777a2f 676c6962 632d322e ../.dwz/glibc-2.
  0x00000010 33372d31 2e666333 382e7838 365f3634 37-1.fc38.x86_64
  0x00000020 002a30a8 c71ff2fd aa58a637 226c0f56 .*0......X.7"l.V
  0x00000030 3ffd674a de                         ?.gJ.
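
For illustration, the hex dump above reads as a NUL-terminated path to the shared dwz file followed by build-ID bytes. Below is a minimal sketch of splitting the section contents that way. The `struct altlink` type and the `parse_debugaltlink` name are hypothetical and are not taken from addr2line.c; the layout (path, NUL, then build ID) is an assumption based on this dump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not a struct from addr2line.c. */
struct altlink {
    const char *path;              /* NUL-terminated path stored in the section */
    const unsigned char *build_id; /* bytes that follow the path's NUL */
    size_t build_id_len;           /* number of build-ID bytes */
};

/*
 * Split the raw bytes of a .gnu_debugaltlink section into a path and a
 * build ID. Returns 0 on success, or -1 if the section has no NUL
 * terminator for the path.
 */
static int
parse_debugaltlink(const unsigned char *sec, size_t len, struct altlink *out)
{
    const unsigned char *nul;

    assert(out != NULL);
    if (sec == NULL) return -1;

    /* The path must be terminated inside the section. */
    nul = memchr(sec, '\0', len);
    if (nul == NULL) return -1;

    out->path = (const char *)sec;
    out->build_id = nul + 1;
    out->build_id_len = len - (size_t)(nul + 1 - sec);
    return 0;
}
```

For the libc dump above, this would yield the path `../.dwz/glibc-2.37-1.fc38.x86_64` and the 20 bytes that follow it as the build ID.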

Some DWARF abbrevs contain references to this shared data:

% readelf  --debug-dump /usr/lib/debug/lib64/libc.so.6-2.37-1.fc38.x86_64.debug
...
Contents of the .debug_abbrev section (loaded from /usr/lib/debug/lib64/libc.so.6-2.37-1.fc38.x86_64.debug):
...
   50      DW_TAG_typedef    [no children]
    DW_AT_name         DW_FORM_GNU_strp_alt
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data1
    DW_AT_decl_column  DW_FORM_data1
    DW_AT_type         DW_FORM_GNU_ref_alt
    DW_AT value: 0     DW_FORM value: 0
...

Because addr2line.c doesn't know how to read these DW_FORM_GNU_* forms in debug_info_reader_read_value, it doesn't advance the reader at all and so the DWARF parsing becomes confused. This can lead to a garbage abbrev number being read, or even to just straight up segfaults.

I have a patch which simply skips over the right number of bytes for these forms, without actually reading any of the data: https://github.com/ruby/ruby/pull/7731. This was sufficient to make the crash reporter work again properly on my machine.
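
To make the skipping idea concrete, here is a minimal, self-contained sketch. It is not the code from the pull request: `struct cursor` and `skip_gnu_alt_form` are hypothetical names, and the real reader in addr2line.c is structured differently. It relies on both GNU forms being section offsets, so they occupy 4 bytes in 32-bit DWARF and 8 bytes in 64-bit DWARF.

```c
#include <stddef.h>
#include <stdint.h>

/* GNU extension form codes used by dwz-compressed debuginfo. */
#define DW_FORM_GNU_ref_alt  0x1f20
#define DW_FORM_GNU_strp_alt 0x1f21

/* Hypothetical read cursor over a .debug_info buffer (illustration only). */
struct cursor {
    const unsigned char *p;   /* current read position */
    const unsigned char *end; /* one past the last valid byte */
};

/*
 * Advance the cursor past a DW_FORM_GNU_*_alt value without decoding it.
 * Returns 0 on success, or -1 if the form is not one of the two GNU alt
 * forms or the buffer is too short.
 */
static int
skip_gnu_alt_form(struct cursor *c, uint64_t form, int is_dwarf64)
{
    size_t n;

    switch (form) {
      case DW_FORM_GNU_ref_alt:
      case DW_FORM_GNU_strp_alt:
        /* Both forms hold an offset into the alternate debug file. */
        n = is_dwarf64 ? 8 : 4;
        break;
      default:
        return -1;
    }

    /* Refuse to run past the end of the buffer. */
    if ((size_t)(c->end - c->p) < n) return -1;
    c->p += n;
    return 0;
}
```

The point of the sketch is only that the reader stays in sync with the attribute stream; the referenced value in the dwz file is never looked up.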

Technically speaking, perhaps we should actually open the .gnu_debugaltlink file and dig out the referenced attribute value. This would be required if filename strings were put into the shared dwz info with DW_FORM_GNU_strp_alt. If you think this is necessary, I can try to put together a follow-up patch to do this. However, I've not yet seen a debuginfo file that does this.

Updated by kjtsanaktsidis (KJ Tsanaktsidis) 12 months ago

This is also causing test failures on my machine, because there are tests on the output of the bug reporter - e.g.


[17591/23407] TestBugReporter#test_bug_reporter_add = 0.15 s
  4) Failure:
TestBugReporter#test_bug_reporter_add [/home/kj/ruby/test/-ext-/bug_reporter/test_bug_reporter.rb:31]:
pid 169551 exit 1
| -:1: [BUG] Segmentation fault at 0x000003e80002964f
| ruby 3.3.0dev (2023-04-24T03:48:15Z master 886986b3ef) [x86_64-linux]
|
| -- Control frame information -----------------------------------------------
| c:0003 p:---- s:0012 e:000011 CFUNC  :kill
| c:0002 p:0022 s:0006 e:000005 EVAL   -:1 [FINISH]
| c:0001 p:0000 s:0003 E:001100 DUMMY  [FINISH]
|
| -- Ruby level backtrace information ----------------------------------------
| -:1:in `<main>'
| -:1:in `kill'
|
| -- Threading information ---------------------------------------------------
| Total ractor count: 1
| Ruby thread count for this ractor: 1
|
| -- Machine register context ------------------------------------------------
|  RIP: 0x00007f0793a6adab RBP: 0x000000000000000b RSP: 0x00007ffe339ef5b8
|  RAX: 0x0000000000000000 RBX: 0x0000000000000001 RCX: 0x00007f0793a6adab
|  RDX: 0x000000000002964f RDI: 0x000000000002964f RSI: 0x000000000000000b
|   R8: 0x0000000000000000  R9: 0x0000000000000000 R10: 0x00007f0793a434e8
|  R11: 0x0000000000000206 R12: 0x0000000000000002 R13: 0x00007f0793927048
|  R14: 0x000000000002964f R15: 0x0000000000000001 EFL: 0x0000000000000206
|
| -- C level backtrace information -------------------------------------------
| 1344: Abbrev Number 27547 not found
.

1. [2/2] Assertion for "stderr"
   | Expected /Sample bug reporter: 12345/
   | to match
   |   "-- Control frame information -----------------------------------------------\n"+
   |   "c:0003 p:---- s:0012 e:000011 CFUNC  :kill\n"+
   |   "c:0002 p:0022 s:0006 e:000005 EVAL   -:1 [FINISH]\n"+
   |   "c:0001 p:0000 s:0003 E:001100 DUMMY  [FINISH]\n\n"+
   |   "-- Ruby level backtrace information ----------------------------------------\n"+
   |   "-:1:in `<main>'\n"+
   |   "-:1:in `kill'\n\n"+
   |   "-- Threading information ---------------------------------------------------\n"+
   |   "Total ractor count: 1\n"+
   |   "Ruby thread count for this ractor: 1\n\n"+
   |   "-- Machine register context ------------------------------------------------\n"+
   |   " RIP: 0x00007f0793a6adab RBP: 0x000000000000000b RSP: 0x00007ffe339ef5b8\n"+
   |   " RAX: 0x0000000000000000 RBX: 0x0000000000000001 RCX: 0x00007f0793a6adab\n"+
   |   " RDX: 0x000000000002964f RDI: 0x000000000002964f RSI: 0x000000000000000b\n"+
   |   "  R8: 0x0000000000000000  R9: 0x0000000000000000 R10: 0x00007f0793a434e8\n"+
   |   " R11: 0x0000000000000206 R12: 0x0000000000000002 R13: 0x00007f0793927048\n"+
   |   " R14: 0x000000000002964f R15: 0x0000000000000001 EFL: 0x0000000000000206\n\n"+
   |   "-- C level backtrace information -------------------------------------------\n"+
   |   "1344: Abbrev Number 27547 not found\n"
   | after 4 patterns with 123 characters.

Updated by alanwu (Alan Wu) 12 months ago

  • Status changed from Open to Closed