Bug #20001

closed

Make Ruby work properly with ASAN enabled

Added by kjtsanaktsidis (KJ Tsanaktsidis) 6 months ago. Updated 4 months ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:115346]

Description

This ticket covers some work I want to do to get the ASAN infrastructure in Ruby working again. I don't know if it ever worked well, but if it did, it appears to have bitrotted. Here are a few of its current problems:

Stack size calculations are wrong

Ruby takes the address of a local variable as part of working out the top/bottom of the native stack in native_thread_init_stack. Because ASAN can end up putting some local variables on a "fake stack", this calculation can produce the wrong result and set th->ec->machine.stack_start incorrectly. stack_check then thinks that the machine stack has overflowed all the time, which causes programs like the following to fail:

ASAN_OPTIONS=use_sigaltstack=0:detect_leaks=0 ./miniruby -e 'Thread.new { puts "hi" }.join'
#<Thread:0x00007fb5d79f3f28 -e:1 run> terminated with exception (report_on_exception is true):
SystemStackError
-e: stack level too deep (SystemStackError)

Another consequence of stack size detection not working properly is that the machine stack is not properly marked during GC, so things on the stack which should be considered live get prematurely collected.

ASAN provides the __asan_addr_is_in_fake_stack function which can be used to get the address of a local variable on the real stack; I think Ruby's various stack-detecting macros could then make use of this to make it work.
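As a rough illustration of the idea above, here is a minimal sketch of mapping a possibly-fake local address back to the real stack. The helper name sketch_real_stack_addr and the SKETCH_ASAN_ENABLED macro are hypothetical and are not Ruby's actual code; __asan_get_current_fake_stack and __asan_addr_is_in_fake_stack are the functions declared in <sanitizer/asan_interface.h>. When ASAN is not enabled, the helper simply returns its argument.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback so __has_feature can be used on compilers that lack it. */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

/* SKETCH_ASAN_ENABLED is a hypothetical name used only in this sketch. */
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
# include <sanitizer/asan_interface.h>
# define SKETCH_ASAN_ENABLED 1
#else
# define SKETCH_ASAN_ENABLED 0
#endif

/*
 * Hypothetical helper: given the address of a local variable, return an
 * address on the real machine stack. If ASAN placed the variable in a
 * fake frame, __asan_addr_is_in_fake_stack returns the corresponding
 * real-stack address; otherwise the input is already a real-stack address.
 */
static void *sketch_real_stack_addr(void *addr)
{
    assert(addr != NULL);
#if SKETCH_ASAN_ENABLED
    void *fake_stack = __asan_get_current_fake_stack();
    if (fake_stack != NULL) {
        void *beg, *end; /* bounds of the fake frame; unused here */
        void *real = __asan_addr_is_in_fake_stack(fake_stack, addr, &beg, &end);
        if (real != NULL) return real;
    }
#endif
    return addr;
}
```

A stack-detecting macro could then pass the address of its local through a helper like this before using it as a stack bound. How this would actually be wired into native_thread_init_stack is not shown here.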

VALUEs in fake stacks are not marked

Another consequence of ASAN storing local variables in fake stacks is that we don't see them when doing the machine stack mark. Again, the __asan_addr_is_in_fake_stack function can help us here. ASAN leaves a pointer to the fake stack on the real stack in every frame. When marking the machine stack, we can check each word to see if it's a pointer to a fake stack frame, and then use __asan_addr_is_in_fake_stack to get the extents of the fake frame and scan that too.

This seems to be how V8 does it, for example.
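The scanning approach described in this section could look roughly like the sketch below. All sketch_* names and the callback type are hypothetical and do not correspond to Ruby's GC functions. It assumes the scanning code is excluded from ASAN instrumentation (via the no_sanitize_address attribute), since fake frames contain poisoned redzones. Without ASAN it degrades to a plain word-by-word scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef __has_feature
# define __has_feature(x) 0
#endif

/* SKETCH_* macros are hypothetical names used only in this sketch. */
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
# include <sanitizer/asan_interface.h>
# define SKETCH_ASAN_ENABLED 1
# define SKETCH_NO_ASAN __attribute__((no_sanitize_address))
#else
# define SKETCH_ASAN_ENABLED 0
# define SKETCH_NO_ASAN
#endif

/* Hypothetical callback invoked for each candidate word. */
typedef void (*sketch_mark_fn)(uintptr_t word, void *ctx);

/* Visit every word in [lo, hi). */
SKETCH_NO_ASAN
static void sketch_scan_words(const uintptr_t *lo, const uintptr_t *hi,
                              sketch_mark_fn mark, void *ctx)
{
    for (const uintptr_t *p = lo; p < hi; p++) mark(*p, ctx);
}

/*
 * Visit every word of the real stack range [lo, hi). Under ASAN, if a
 * word points into a fake frame, also visit every word of that frame.
 */
SKETCH_NO_ASAN
static void sketch_mark_machine_stack(const uintptr_t *lo, const uintptr_t *hi,
                                      sketch_mark_fn mark, void *ctx)
{
    assert(lo <= hi);
#if SKETCH_ASAN_ENABLED
    void *fake_stack = __asan_get_current_fake_stack();
#endif
    for (const uintptr_t *p = lo; p < hi; p++) {
        mark(*p, ctx);
#if SKETCH_ASAN_ENABLED
        if (fake_stack != NULL) {
            void *beg, *end;
            if (__asan_addr_is_in_fake_stack(fake_stack, (void *)*p, &beg, &end) != NULL) {
                sketch_scan_words((const uintptr_t *)beg, (const uintptr_t *)end, mark, ctx);
            }
        }
#endif
    }
}

/* Demo callback: counts how many words were visited. */
static void sketch_count_word(uintptr_t word, void *ctx)
{
    (void)word;
    *(size_t *)ctx += 1;
}
```

Note that in this sketch a fake frame may be scanned more than once if several words point into it. That is harmless for a conservative mark but wasteful, so a real implementation might want to avoid it.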

Doesn't work with GCC

Our ASAN implementation doesn't work with GCC, even though GCC supports ASAN. This is because we use the __has_feature(address_sanitizer) macro in sanitizers.h, which is a clang-ism. The equivalent GCCism is __SANITIZE_ADDRESS__ and we should check that too.
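A check that covers both compilers might look like the following sketch. The SKETCH_ASAN_ENABLED macro and sketch_asan_enabled function are hypothetical names for illustration, not what sanitizers.h actually defines.

```c
/* Compilers without __has_feature get a fallback that always says "no". */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

/*
 * GCC defines __SANITIZE_ADDRESS__ when building with -fsanitize=address;
 * clang reports it through __has_feature(address_sanitizer).
 */
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
# define SKETCH_ASAN_ENABLED 1
#else
# define SKETCH_ASAN_ENABLED 0
#endif

/* Returns 1 when this translation unit was compiled with ASAN, else 0. */
static int sketch_asan_enabled(void)
{
    return SKETCH_ASAN_ENABLED;
}
```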

Plan of attack

At the moment, I can't even get a full build of Ruby to run, because miniruby crashes during the build process. My plan of attack here is to:

  • Address those known problems I've already identified above
  • Get make to actually work with ASAN
  • Try running the test suite through ASAN, and fix any issues that turn up
  • I'm thinking we should add an --enable-asan or --enable-address-sanitizer or some such to our configure script, to make it easy to build Ruby with ASAN without having to poke around with individual CFLAGS/LDFLAGS
  • Eventually, it would be great to actually run the tests under ASAN in CI

This is probably a medium term body of work, but I'll try and tackle it in bits.

Also: @HParker (Adam Hess) and @peterzhu2118 (Peter Zhu) - I know you folks have been working on getting Valgrind to work better with Ruby, for leak detection. I think I see my efforts here as complementary to yours, rather than duplicative. The ASAN infrastructure for poisoning/unpoisoning stuff in the GC already exists and is close to working properly, and it really did help me solve a bug yesterday (https://bugs.ruby-lang.org/issues/19994), so it seems useful and should be made to work. Your work on freeing memory on shutdown (https://bugs.ruby-lang.org/issues/19993) should actually help ASAN usefully detect leaks as well. I think ASAN might be better for eventually running CI checks against the full Ruby test suite, since allegedly it's faster. However, if you think solving these issues with ASAN is a waste of time and Valgrind can catch the same bugs already, please chime in!


Related issues 1 (1 open, 0 closed)

Related to Ruby master - Misc #20387: Meta-ticket for ASAN support (Assigned, kjtsanaktsidis (KJ Tsanaktsidis))
