Bug #2295
closed
segmentation faults
Description
=begin
My server crashes 3-4 times a day with segmentation faults. The fault is not always in the same spot, so the attached report is just one example. I will keep adding more reports as they come in.
The application is a Ruby on Rails stack on top of MySQL.
Ruby was compiled from source on Fedora 8 running on EC2.
=end
Files
Updated by tomer.doron (tomer doron) about 15 years ago
- File segfault2.txt added
=begin
another example
=end
Updated by tomer.doron (tomer doron) about 15 years ago
- File segfault3.txt added
=begin
same as attachment 1, so you can ignore this one
=end
Updated by tomer.doron (tomer doron) about 15 years ago
- File segfault3.txt added
=begin
another example
=end
Updated by tomer.doron (tomer doron) about 15 years ago
=begin
looks similar to what is described in #2019
=end
Updated by nobu (Nobuyoshi Nakada) about 15 years ago
- Status changed from Open to Feedback
=begin
Which are you using, MySQL/Ruby or Ruby/MySQL, and which versions of it and of Rails?
=end
Updated by tomer.doron (tomer doron) about 15 years ago
=begin
rails 2.3.4
mysql gem 2.8.1
=end
Updated by tomer.doron (tomer doron) about 15 years ago
- File related_error.txt added
=begin
another C-level exception that shows up in the log and may be related
=end
Updated by marcandre (Marc-Andre Lafortune) about 15 years ago
- Status changed from Feedback to Open
=begin
=end
Updated by tomer.doron (tomer doron) about 15 years ago
- File segfault4.txt added
=begin
and one last example for today :)
=end
Updated by rogerdpack (Roger Pack) about 15 years ago
=begin
Is there any way to reliably recreate it? Maybe try it with the trunk version of 1.9.1?
GL.
-r
=end
Updated by tomer.doron (tomer doron) about 15 years ago
=begin
While this happens in production several times a day, there is no specific scenario I can point you to. It is random in nature, as memory issues tend to be. So while I can't say it is easily recreatable, it most certainly recurs reliably. I can keep uploading the crash reports as they come in.
A few data points:
- This is running on a Fedora 8 VM in Amazon EC2.
- Ruby was compiled from source.
=end
Updated by rogerdpack (Roger Pack) about 15 years ago
=begin
Have you tried it on other servers? Under valgrind?
Can you create unit tests that thrash the system until it occurs? (Without a way to reproduce it, this will be hard.)
-r
=end
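A thrashing test of the kind suggested above could be sketched roughly as follows. This is only a minimal illustration, not something posted in this thread: the `thrash` helper, its parameters, and the string-allocating workload are all hypothetical stand-ins for real application requests. A real segmentation fault would kill the whole process rather than raise a Ruby exception, so the error count only covers ordinary exceptions.

```ruby
require "thread"

# Hypothetical helper: call the given block from several threads until the
# deadline passes or an exception escapes, then report what happened.
def thrash(threads: 4, seconds: 1.0, &work)
  deadline = Time.now + seconds
  counts = Array.new(threads, 0)   # one slot per thread, so no locking needed
  errors = Queue.new               # thread-safe collection of raised errors

  workers = threads.times.map do |i|
    Thread.new do
      begin
        loop do
          work.call(i)
          counts[i] += 1
          break if Time.now >= deadline
        end
      rescue StandardError => e
        errors << e
      end
    end
  end
  workers.each(&:join)

  { iterations: counts.sum, errors: errors.size }
end

# Placeholder workload (an assumption): allocate and discard many strings
# to put pressure on the garbage collector.
result = thrash(threads: 4, seconds: 0.2) do |_i|
  Array.new(1_000) { |n| "row-#{n}" }.join(",")
end
puts result.inspect
```

In practice the block would issue real requests or database queries against a staging copy of the application, and the run would be left going for much longer than a fraction of a second.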
Updated by tomer.doron (tomer doron) about 15 years ago
=begin
The exact same application was deployed in a different production environment using OS X Server with Ruby 1.9.1 from MacPorts, and I did not experience these issues there. When I moved my servers to Amazon EC2, I redeployed on Amazon's Fedora 8 core VMs and compiled Ruby 1.9.1 from source. At first everything seemed fine, but as users started pounding the system these crashes began to appear. Since this is a production environment I could not leave it crashing all the time, so I rebuilt the server with Ruby 1.8 and the issues are gone.
I still have the 1.9.1 installation available, but without load it will be more difficult to recreate. I can certainly install valgrind and do some further debugging. I will update with info once I have some.
=end
Updated by jarrednicholls (Jarred Nicholls) about 15 years ago
=begin
tomer,
I ran into seg faults on my EC2 boxes with 1.9.1 as well. If your issue is the same as mine, you need to change the kernel and ramdisk for your EC2 instance from the default (which I believe is 2.6.21) to a different version (e.g., 2.6.18). Amazon is doing updates to fix their 2.6.21 kernel and their Ubuntu Server 2.6.31 (beta) kernel. But if you load the 2.6.18 kernel and associated ramdisk, you might not have those problems anymore. Note that my issue literally crashed the kernel/instance and forced a reboot of the instance, so it may not be the same problem. Still, you can at least give it a shot.
Kernel ID: aki-f5c1219c
Ramdisk ID: ari-dbc121b2
I wish you luck. For me, I simply went back to using Passenger with Ruby EE (which is 1.8.7), and did a patch to the Passenger gem to do asynchronous request dispatching to allow for long running I/O requests to run in parallel so they wouldn't block short quick requests from getting through. I'll move to 1.9 when everything is more stable, and/or go to JRuby if I need true & fast multi-threading. I tried JRuby and was having issues with Cookie Store and decided its memory footprint was way too big for the time being.
Jarred
=end
Updated by tomer.doron (tomer doron) almost 15 years ago
=begin
I have set up a new EC2 instance with the suggested kernel, but I am afraid that has not solved the issue and I am still getting similar segmentation faults. I will have to roll back to 1.8 again until this is resolved.
Because the segmentation faults are inconsistent and not unique to one area of the code, I assume the cause is somewhere deeper: in the EC2 kernel, in core Ruby code, or (more likely) in the combination of 1.9 and the EC2 kernel. If anyone has good suggestions for how to trace the root cause, I am willing to invest the time to do the research.
=end
Updated by tomer.doron (tomer doron) almost 15 years ago
=begin
Attempted upgrading to 1.9.1-p376, with Ruby compiled from source using gcc with the --enable-shared option.
Similar segmentation faults occur on both Fedora 8 and Ubuntu 9.04 (Jaunty).
=end
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
- Description updated (diff)
- Status changed from Open to Closed