Bug #8100: Segfault in trunk

Added by judofyr (Magnus Holm) over 12 years ago. Updated over 12 years ago.

Status: Closed
Target version:
ruby -v: ruby 2.1.0dev (2013-03-18 trunk 39805) [x86_64-linux]
Backport:
[ruby-core:53439]

Description

=begin
The full backtrace (VM, C and Ruby) is attached and also available at https://travis-ci.org/rtomayko/tilt/jobs/5479138

I haven't been able to reproduce it (and thus I can't create a reduced test case).

This is the test that fails: https://github.com/rtomayko/tilt/blob/581230cbb3b314e88cf5ec9167a24ebb8acc7a93/test/tilt_compilesite_test.rb#L31

The code in question will do these steps in several threads at the same time:

The method uses some funky class << self tricks (to ensure that it gets evaluated under a proper constant scope). It also caches the methods, so it won't always define a new method, but might reuse an UnboundMethod from a previous compilation (which might have happened on a different thread).
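
The caching pattern described above can be sketched roughly as follows. This is a minimal illustration, not Tilt's actual code: the names TemplateCache, compile, render and __compiled_template are hypothetical, the class << self scoping trick is left out, and the Mutex is an assumption added so that the sketch itself is well-behaved.

```ruby
# Hypothetical sketch of "compile once, cache the UnboundMethod, reuse it
# from several threads". None of these names come from Tilt.
class TemplateCache
  def initialize
    @methods = {}       # source string => UnboundMethod
    @lock = Mutex.new   # assumption: serialize compilation in this sketch
  end

  # Define a temporary method from the source, capture it as an
  # UnboundMethod, then remove the temporary definition again.
  def compile(source)
    @lock.synchronize do
      @methods[source] ||= begin
        Object.class_eval("def __compiled_template; #{source}; end")
        unbound = Object.instance_method(:__compiled_template)
        Object.send(:remove_method, :__compiled_template)
        unbound
      end
    end
  end

  # Bind the cached method to a scope object and call it.
  def render(source, scope = Object.new)
    compile(source).bind(scope).call
  end
end

cache = TemplateCache.new
results = 4.times.map do
  Thread.new { 100.times.map { cache.render('"hello" * 3') }.uniq }
end.map(&:value)
```

The point of the sketch is that the UnboundMethod outlives the method definition it was taken from, and may be called on a thread other than the one that compiled it.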

I know it's not much to go on, but the backtrace at least seems to suggest that the error happened in rb_ary_fill in array.c.

I've also had another report of segfault in Tilt + Ruby 2.0.0, but I don't have the full backtrace yet: https://github.com/rtomayko/tilt/issues/179. Might this be related?

Let me know if you need more details.
=end


Files

seglog.txt (104 KB), judofyr (Magnus Holm), 03/15/2013 08:58 PM
segfault_spec.tar.gz (3.01 KB), zzak (zzak _), 03/18/2013 10:51 AM
seg.txt (63.4 KB), DAddYE (Davide D'Agostino), 03/18/2013 04:14 PM
fail.rb (604 Bytes), reduced script, judofyr (Magnus Holm), 03/22/2013 06:38 PM

Related issues 3 (0 open, 3 closed)

Has duplicate Ruby - Bug #8336: Segfault in :=~ (Closed)
Has duplicate Ruby - Bug #8353: segfault with puma-1.6.3 (Closed)
Has duplicate Ruby - Bug #8056: Random segmentation faults in Tempfile (Closed)

Updated by zzak (zzak _) over 12 years ago #1 [ruby-core:53489]

  • File segfault_spec.tar.gz added
  • Subject changed from Segfault in ruby-2.0.0p0 to Segfault in trunk
  • Target version set to 2.1.0
  • ruby -v changed from ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux] to ruby 2.1.0dev (2013-03-18 trunk 39805) [x86_64-linux]

I've updated the description of this ticket, because I'm able to reproduce a similar bug. Only similar in that we're using a lot of the same dependencies.

I also went ahead and created a reproducible script (as small as possible). Here are the instructions for reproducing the segfault:

  1. git clone git://github.com/zzak/segfault_spec.rb.git
  2. bundle install
  3. bundle exec rspec segfault_spec.rb
  4. Repeat step 3 until it segfaults. This may take a few tries.
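
Since the crash is intermittent, the "repeat until segfault" step can be automated. The helper below is a hypothetical sketch (run_until_failure and max_attempts are not part of the repository); it reruns a command until it exits unsuccessfully or the attempt limit is reached.

```ruby
# Hypothetical helper: rerun a command until it fails.
# Returns the attempt number of the first failure, or nil if the command
# succeeded on every attempt.
def run_until_failure(command, max_attempts: 50)
  1.upto(max_attempts) do |attempt|
    # system returns true only for a zero exit status; a crash or a
    # non-zero exit yields false, and a command that cannot be run yields nil.
    ok = system(*command, out: File::NULL, err: File::NULL)
    return attempt unless ok
  end
  nil
end

# Possible usage for step 3 (run from inside the cloned repository):
#   run_until_failure(%w[bundle exec rspec segfault_spec.rb])
```

Note that this stops on any failing run, not only on a segfault, so the output of the failing attempt still needs to be checked by hand.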

I will also attach an archive of the script.

Updated by zzak (zzak _) over 12 years ago #3 [ruby-core:53508]

Forgot to add a link to the repo on github: https://github.com/zzak/segfault_spec.rb

Updated by wardrop (Tom Wardrop) over 12 years ago #4 [ruby-core:53532]

I'm also getting segfaults on Ruby 2.0.0. It seems to be related to threading or forking, but I can't quite put my finger on it. All I can say is that I don't get it when running my web app in WEBrick on my Mac, but when running it on my CentOS server with Phusion Passenger using the smart spawn method, I get it all the time: roughly every 10th request segfaults. Setting Passenger to the conservative spawn method (one request per process) reduces the segfault rate considerably, but segfaults still occur.

Here's a stack overflow thread about it, with a response I left on there with a bit more information about my experiences: http://stackoverflow.com/questions/15315809/segfault-error-in-sinatra-after-upgrading-to-ruby-2-0-beta/15492401#15492401

I also reported this to the Phusion Passenger Google Group before realising it's a problem with ruby 2.0.0: https://groups.google.com/forum/?fromgroups=#!topic/phusion-passenger/iEOE4shl_jE

Here's a log including numerous segfaults from my CentOS server running Phusion Passenger: https://gist.github.com/Wardrop/5179380

Either way, it looks like something common to web applications is causing this, or perhaps web application frameworks are so far the most common cases in which Ruby 2.0.0 is being used.

Updated by judofyr (Magnus Holm) over 12 years ago #5 [ruby-core:53631]

I've managed to reduce the script to 30 lines (with no dependencies); it segfaults in both 2.0.0-p0 and trunk (39875). It doesn't segfault every time, though, so if it runs for more than a few seconds, simply press Ctrl-C and try again.

Updated by judofyr (Magnus Holm) over 12 years ago #6 [ruby-core:53633]

Here's a backtrace I got in gdb: http://pastie.org/7064676. rb_gc_mark_unlinked_live_method_entries seems suspicious and related to what the script does.

Updated by wardrop (Tom Wardrop) over 12 years ago #7 [ruby-core:53634]

They've obviously done work on the garbage collector for Ruby 2.0. This is likely a bug introduced as a result of that. Good work tracking it down, judofyr.

Updated by judofyr (Magnus Holm) over 12 years ago #8 [ruby-core:53636]

After working with charliesome we've now found an even simpler test case:

http://eval.in/13339

This always segfaults for me on trunk.

Updated by Anonymous over 12 years ago #9 [ruby-core:53640]

=begin
Magnus and I reduced this down to an even simpler^2 test case:

loop do
  def x
    "hello" * 1000
  end

  method(:x).call
end

http://eval.in/13344
=end

Updated by kosaki (Motohiro KOSAKI) over 12 years ago #10 [ruby-core:53643]

  • Category set to core
  • Status changed from Open to Assigned
  • Assignee set to authorNari (Narihiro Nakamura)

Updated by nobu (Nobuyoshi Nakada) over 12 years ago #11

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r39883.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • KNOWNBUGS.rb: test for [Bug #8100].

Updated by nobu (Nobuyoshi Nakada) over 12 years ago #12 [ruby-core:53666]

  • Status changed from Closed to Assigned
  • % Done changed from 100 to 0

Updated by Anonymous over 12 years ago #13 [ruby-core:53668]

nobu-san, this will loop forever when the bug is fixed. Perhaps change it to 100_000.times?
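
A bounded variant along those lines might look like the sketch below. It is only an illustration of the suggestion, not the test that was committed; the variables iterations and last are added here so the result can be inspected.

```ruby
# Bounded variant of the reduced test case: redefine x and call it through
# a Method object a fixed number of times, so the script terminates even
# when the interpreter no longer crashes.
iterations = 100_000
last = nil

iterations.times do
  def x
    "hello" * 1000
  end

  last = method(:x).call
end
```

On an interpreter without the bug the loop simply finishes and last holds the 5000-character string.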

Updated by wardrop (Tom Wardrop) over 12 years ago #14 [ruby-core:53670]

I'd set it to a duration rather than a fixed number of iterations. I've seen it run for 2 seconds on my machine before segfaulting. 3 seconds should fail almost every time.

start_time = Time.now
while (Time.now - start_time) < 3
  def x
    "hello" * 1000
  end
  method(:x).call
end

Updated by nobu (Nobuyoshi Nakada) over 12 years ago #15 [ruby-core:53674]

charliesome (Charlie Somerville) wrote:

nobu-san, this will loop forever when the bug is fixed. Perhaps change it to 100_000.times?

Sure, I forgot it before the commit.

Updated by naruse (Yui NARUSE) over 12 years ago #16

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r39894.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


Add timeout to infinite loop [Bug #8100]

On FreeBSD, it doesn't SEGV.
http://fbsd.rubyci.org/~chkbuild/ruby-trunk/log/20130323T170203Z.log.html.gz
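
The changeset itself is not shown in this thread, so the following is only a sketch of how a timeout could bound the loop on platforms where it never crashes. It assumes the standard library's Timeout module; the 0.5 second limit and the timed_out flag are illustrative choices.

```ruby
require "timeout"

timed_out = false

begin
  # Stop the otherwise infinite loop after a fixed wall-clock limit.
  Timeout.timeout(0.5) do
    loop do
      def x
        "hello" * 1000
      end

      method(:x).call
    end
  end
rescue Timeout::Error
  # Reaching the limit without a crash means the loop ran cleanly.
  timed_out = true
end
```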

Updated by naruse (Yui NARUSE) over 12 years ago #17 [ruby-core:53681]

  • Status changed from Closed to Assigned

Updated by authorNari (Narihiro Nakamura) over 12 years ago #18

  • Status changed from Assigned to Closed

This issue was solved with changeset r39919.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • proc.c (bm_free): need to clean up the mark flag of a free and
    unlinked method entry. [Bug #8100] [ruby-core:53439]

Updated by zzak (zzak _) over 12 years ago #19 [ruby-core:53702]

Thank you nari-san and everyone who helped with this.

Should this be backported as well?

Updated by authorNari (Narihiro Nakamura) over 12 years ago #20 [ruby-core:53708]

zzak (Zachary Scott) wrote:

Thank you nari-san and everyone who helped with this.

Should this be backported as well?

Yeah, this fix should be backported to 1.9.3 and 2.0.0.

Updated by wardrop (Tom Wardrop) over 12 years ago #21 [ruby-core:53715]

Eagerly awaiting the backport. Can someone please leave a comment when it's back-ported to ruby-2.0.0 head?

Updated by authorNari (Narihiro Nakamura) over 12 years ago #22 [ruby-core:53718]

wardrop (Tom Wardrop) wrote:

Eagerly awaiting the backport. Can someone please leave a comment when it's back-ported to ruby-2.0.0 head?

The backport request ticket is here.
https://bugs.ruby-lang.org/issues/8163
You might want to watch this ticket for your purpose.

Updated by wardrop (Tom Wardrop) over 12 years ago #23 [ruby-core:53721]

Thanks for that. By the way, I've applied the patch to my production server. Write me down as another happy customer :-)
