Bug #15763

Segmentation fault in timeout.rb / sleep

Added by stan-envato (Stan Pitucha) 4 months ago. Updated 2 months ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-darwin18]
[ruby-core:92239]

Description

I'm running into crashes on both Ruby 2.6.1 and 2.6.2 (2.5.x is fine).
I'm on macOS Mojave with Ruby installed via rbenv / ruby-build. Confirmed on two different machines.

The crash happens through the parallel gem, but it happens even if the number of processes is reduced to 1.

Short summary:

-- Control frame information -----------------------------------------------
c:0003 p:---- s:0011 e:000010 CFUNC :sleep
c:0002 p:0025 s:0006 e:000005 BLOCK /Users/viraptor/.rbenv/versions/2.6.2/lib/ruby/2.6.0/timeout.rb:86 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
/Users/viraptor/.rbenv/versions/2.6.2/lib/ruby/2.6.0/timeout.rb:86:in `block (2 levels) in timeout'
/Users/viraptor/.rbenv/versions/2.6.2/lib/ruby/2.6.0/timeout.rb:86:in `sleep'

The rest is in the logs.


Files

crash_log (68.9 KB) - console output - stan-envato (Stan Pitucha), 04/11/2019 12:22 AM
ruby_2019-04-11-101832-3_Stans-MacBook-Pro.crash (44.7 KB) - osx crash report - stan-envato (Stan Pitucha), 04/11/2019 12:25 AM

Related issues

Related to Ruby master - Bug #13646: Segmentation fault with postgresql_adapter in Rails (Open)

History

Updated by stan-envato (Stan Pitucha) 4 months ago

Additionally, the issue does not seem to happen on every build. If I rebuild the same version of Ruby, the issue may go away (until it comes back after another few rebuilds).

Updated by mame (Yusuke Endoh) 4 months ago

This might be the same issue as:

The common points are:

  • macOS (darwin17 or 18)
  • uses multiple threads
  • segfault in getaddrinfo

I could be wrong, but I suspect a bug in macOS's getaddrinfo.

Can you show a short program that causes the segfault?
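
For reference, a minimal sketch of the kind of short program being asked for, built only from the common points listed above (multiple threads, getaddrinfo). It is not confirmed to reproduce the segfault. The helper name `resolve_concurrently`, the host "localhost", the port, and the thread and iteration counts are all illustrative assumptions.

```ruby
require "socket"

# Hypothetical helper: call getaddrinfo for `host` from several threads at once.
# Returns an array with the number of successful lookups per thread.
def resolve_concurrently(host, thread_count: 4, iterations: 10)
  threads = Array.new(thread_count) do
    Thread.new do
      successes = 0
      iterations.times do
        begin
          # Addrinfo.getaddrinfo calls the platform's getaddrinfo(3).
          Addrinfo.getaddrinfo(host, 80, nil, :STREAM)
          successes += 1
        rescue SocketError
          # A failed lookup is not the crash we are looking for; keep going.
        end
      end
      successes
    end
  end
  threads.map(&:value)
end

if __FILE__ == $PROGRAM_NAME
  p resolve_concurrently("localhost")
end
```

If a program like this crashes on an affected machine without any gems loaded, that would point at the platform resolver rather than at parallel, Puma, or pg.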

Updated by alexagranov (Alex Agranov) 3 months ago

I came here after seeing the same segfault in timeout.rb / CFUNC :sleep on ruby 2.6.2 on macOS with a Rails project running with Puma and 2 worker threads.

After installing 2.6.3, I now see the segfault coming from pg instead, interestingly while opening a connection to the db:

-- C level backtrace information -------------------------------------------
/Users/agranov/.rvm/rubies/ruby-2.6.3/lib/libruby.2.6.dylib(rb_vm_bugreport+0x82) [0x10bd87182]
/Users/agranov/.rvm/rubies/ruby-2.6.3/lib/libruby.2.6.dylib(rb_bug_context+0x1d3) [0x10bbd31f3]
/Users/agranov/.rvm/rubies/ruby-2.6.3/lib/libruby.2.6.dylib(sigsegv+0x51) [0x10bceb591]
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1d) [0x7fff5827db5d]
/usr/lib/system/libsystem_trace.dylib(_os_log_preferences_refresh+0x4c) [0x7fff582a090a]
/usr/lib/system/libsystem_trace.dylib(0x7fff582a113d) [0x7fff582a113d]
/usr/lib/system/libsystem_info.dylib(si_destination_compare_statistics+0x903) [0x7fff581b9843]
/usr/lib/system/libsystem_info.dylib(0x7fff581b81a5) [0x7fff581b81a5]
/usr/lib/system/libsystem_info.dylib(0x7fff581b7d3f) [0x7fff581b7d3f]
/usr/lib/system/libsystem_info.dylib(0x7fff581966df) [0x7fff581966df]
/usr/lib/system/libsystem_c.dylib(_isort+0xc1) [0x7fff58140e5b]
/usr/lib/system/libsystem_c.dylib(0x7fff58140d88) [0x7fff58140d88]
/usr/lib/system/libsystem_info.dylib(0x7fff5818df2d) [0x7fff5818df2d]
/usr/lib/system/libsystem_info.dylib(0x7fff5818c885) [0x7fff5818c885]
/usr/lib/system/libsystem_info.dylib(0x7fff5818bf77) [0x7fff5818bf77]
/usr/lib/system/libsystem_info.dylib(0x7fff5818be7d) [0x7fff5818be7d]
/usr/lib/libpq.5.dylib(connectDBStart+0x1d4) [0x7fff57094af2]
/usr/lib/libpq.5.dylib(PQconnectStart+0x3a) [0x7fff570941de]
/usr/lib/libpq.5.dylib(PQconnectdb+0xb) [0x7fff57094181]

After reducing Puma to a single worker, I have yet to see a segfault.

Updated by alexagranov (Alex Agranov) 3 months ago

Nix that: single Puma worker makes no difference. Back to segfault in timeout.rb.

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

  • Related to Bug #13646: Segmentation fault with postgresql_adapter in Rails added

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

I think mame is correct that this is related to Mac OS X getaddrinfo. We have at least 5 separate bug reports for very similar issues. All are segmentation faults with similar addresses, all on Mac OS X, and all either definitely or probably inside getaddrinfo:

  • #15763: 0x00000001081bfa52 (definitely in getaddrinfo, this issue)
  • #15490: 0x000000010f7e1a3a (definitely in getaddrinfo, during ssh connection)
  • #15639: 0x000000010e82ca3a (definitely in getaddrinfo, during postgresql connection)
  • #15749: 0x000000010d9bda7c (definitely in getaddrinfo, during postgresql connection)
  • #13646: 0x000000010abfaa3a (probably in getaddrinfo, during postgresql connection)

In most of these cases, getaddrinfo isn't even called directly by Ruby; it is called by C code (e.g. libpq). I'm not sure Third Party's Issue is the appropriate status for these issues, but I'm also not sure there is anything we can do to fix them.

Updated by alexagranov (Alex Agranov) 2 months ago

A valid workaround until this is fixed in macOS, if you can get away without IPv6, is to have your web server (e.g. Puma) bind to an IPv4 address such as -b 127.0.0.1 or -b 0.0.0.0 on boot, and then all is :rainbows:.
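
To illustrate what binding to an IPv4 address means at the socket level, here is a minimal sketch using Ruby's standard socket library rather than Puma itself. The helper name `ipv4_only_listener_info` is hypothetical, and port 0 (let the OS pick a free port) is an illustrative choice, not part of the workaround above.

```ruby
require "socket"

# Hypothetical helper: open a listener bound to the IPv4 loopback address only,
# report what it is bound to, then close it again.
def ipv4_only_listener_info
  # A literal IPv4 address is used instead of a hostname.
  server = TCPServer.new("127.0.0.1", 0)
  addr = server.local_address
  { address: addr.ip_address, port: addr.ip_port, ipv4: addr.ipv4? }
ensure
  server.close if server
end

if __FILE__ == $PROGRAM_NAME
  info = ipv4_only_listener_info
  puts "bound to #{info[:address]}:#{info[:port]} (IPv4: #{info[:ipv4]})"
end
```

Whether this avoids the crash in a given application still depends on what else in the process calls getaddrinfo, so treat it as an illustration of the bind address only.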
