Ruby Issue Tracking System https://redmine.ruby-lang.org/
Ruby master - Bug #16814: Segmentation fault in GC while running test/ruby/test_fiber.rb on s390x
https://redmine.ruby-lang.org/issues/16814?journal_id=85274 2020-04-24T06:07:58Z nobu (Nobuyoshi Nakada) nobu@ruby-lang.org
<ul></ul><p>Where does <code>any->as.data.free</code> point?<br>
Is <code>any->as.basic.klass</code> a valid class object?<br>
If you compile gc.c with <code>make DEFS=-DGC_DEBUG gc.o</code>, <code>any->file</code> and <code>any->line</code> hold the Ruby-level source location, which could help you.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85275 2020-04-24T06:38:22Z mame (Yusuke Endoh) mame@ruby-lang.org
<ul></ul><blockquote>
<p>disappeared on April 15.</p>
</blockquote>
<p>You may already know this, but the test has been skipped on s390x since 9948addda67f4b7a6e3575f1eba9025f998811d2.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85289 2020-04-24T23:05:24Z ReiOdaira (Rei Odaira) Rei.Odaira@gmail.com
<ul></ul><p>Did you mean <code>any->as.data.dfree</code>? It does not point to a valid location.</p>
<pre><code>(gdb) print any->as.data
$4 = {basic = {flags = 12, klass = 2930849422520}, dmark = 0x0, dfree = 0x1,
data = 0x2aa6449f9e0}
(gdb) print any->as.typeddata
$5 = {basic = {flags = 12, klass = 2930849422520}, type = 0x0, typed_flag = 1,
data = 0x2aa6449f9e0}
</code></pre>
<p><code>any->as.basic.klass</code> seems to be a valid class. Is there any way to figure out what class it is using the core dump file?</p>
<pre><code>(gdb) print *(struct RBasic *)any->as.basic.klass
$7 = {flags = 98, klass = 2930849422480}
(gdb) print ((struct RBasic *)any->as.basic.klass)->flags & 0x1f
$9 = 2
</code></pre>
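<p>The masking in the gdb session above extracts the built-in type tag from <code>RBasic.flags</code>. The sketch below repeats that arithmetic in plain C. The constants are copied from Ruby's <code>ruby_value_type</code> enum (<code>T_CLASS</code> is 0x02, <code>T_DATA</code> is 0x0c, the mask is 0x1f); the helper names are hypothetical and not part of Ruby's API.</p>

```c
#include <assert.h>   /* for the checks mentioned in the note below */
#include <stdint.h>
#include <string.h>

/* Values copied from enum ruby_value_type; only the two tags seen in
 * the gdb session are listed here. */
enum { T_CLASS = 0x02, T_DATA = 0x0c, T_MASK = 0x1f };

/* Hypothetical helper: extract the built-in type tag from RBasic.flags,
 * i.e. the same "flags & 0x1f" typed at the gdb prompt. */
static int builtin_type(uintptr_t flags)
{
    return (int)(flags & T_MASK);
}

/* Hypothetical helper: name the tag, for the two tags this sketch knows. */
static const char *builtin_type_name(uintptr_t flags)
{
    switch (builtin_type(flags)) {
      case T_CLASS: return "T_CLASS";
      case T_DATA:  return "T_DATA";
      default:      return "other";
    }
}
```

<p>With the values from the session, <code>builtin_type(98)</code> is <code>T_CLASS</code> and <code>builtin_type(12)</code> is <code>T_DATA</code>: the object is a <code>T_DATA</code> whose <code>klass</code> is a class object. This does not by itself say which class it is.</p>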
<p>I've tried <code>make DEFS=-DGC_DEBUG gc.o</code>. It made the test fail much less often than before, and when it did fail, it failed at a different location in GC (gc.c:5240), but it will help a lot. Thanks.</p>
<p>Thanks, Endoh-san, I didn't know the test was skipped.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85296 2020-04-26T17:13:51Z mame (Yusuke Endoh) mame@ruby-lang.org
<ul></ul><p>FYI: I re-enabled the test in question with 93ed465dcdc866013cd93c3662937497900c8086</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85619 2020-05-14T10:50:42Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul></ul><p><a class="user active user-mention" href="https://redmine.ruby-lang.org/users/18">@mame (Yusuke Endoh)</a> I have merged the lightweight concurrency patch, which included some changes to these tests to make them less flaky by moving them into a separate test file. In my experience it seems much more reliable now. Just FYI.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85785 2020-05-25T02:08:58Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul></ul><p>Can you check whether this is still a problem? I merged my changes, which should make this test more reliable, but I did not fix any underlying problems.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85910 2020-05-31T07:42:22Z ReiOdaira (Rei Odaira) Rei.Odaira@gmail.com
<ul></ul><p>On s390x, <code>FIBER_POOL_ALLOCATION_FREE</code> is enabled. The doubly linked list of <code>fiber_pool->vacancies</code> assumes that the head <code>fiber_pool_vacancy</code> has <code>NULL</code> in its <code>previous</code> field. However, when a fiber is released, <code>fiber_pool_vacancy_push()</code> called from <code>fiber_pool_stack_release()</code> does not store <code>NULL</code> to <code>vacancy->previous</code>.</p>
<p>Why this caused the observed symptom:<br>
As <code>test_stack_size</code> uses up the VM stack of the fiber, it writes something into the memory location where <code>struct fiber_pool_vacancy</code> would reside if the stack were free. When the fiber is released, the stack's <code>fiber_pool_vacancy</code> is returned to the head of the <code>vacancies</code> doubly linked list, and then <code>fiber_pool_allocation_free()</code> is triggered. <code>fiber_pool_vacancy_remove()</code> manipulates the doubly linked list, and the <code>vacancy->previous</code> of the released fiber should have been <code>NULL</code> because it is at the head of the list.</p>
<pre><code>if (vacancy->previous) {
    vacancy->previous->next = vacancy->next;
}
</code></pre>
<p>However, since <code>vacancy->previous</code> contains arbitrary data, the code snippet above destroys the memory location that happens to be pointed to by <code>vacancy->previous</code>. In <code>test_stack_size</code>, <code>vacancy->previous</code> happens to point to an encoding object that is live, and <code>vacancy->next</code> happens to be 0. This means <code>vacancy->previous->next = vacancy->next;</code> writes 0 into the <code>as.typeddata.type</code> field of the live object. This eventually leads to the segmentation fault during GC.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85911 2020-05-31T08:21:48Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul></ul><p><a class="user active user-mention" href="https://redmine.ruby-lang.org/users/7848">@ReiOdaira (Rei Odaira)</a> thanks for your careful analysis. It is very useful! I will review the code and get back to you.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85982 2020-06-04T10:05:42Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul></ul><pre><code>inline static struct fiber_pool_vacancy *
fiber_pool_vacancy_push(struct fiber_pool_vacancy * vacancy, struct fiber_pool_vacancy * head)
{
    vacancy->next = head;
#ifdef FIBER_POOL_ALLOCATION_FREE
    if (head) {
        head->previous = vacancy;
        vacancy->previous = NULL; // added
    }
#endif
    return vacancy;
}
</code></pre>
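<p>To illustrate the effect of the added line, here is a minimal standalone sketch, not the code in cont.c. It assumes a simplified two-field <code>struct vacancy</code>, a hypothetical simplification of the remove step, and a non-<code>NULL</code> head at push time. All names in it are made up for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct fiber_pool_vacancy; the real struct
 * has more fields. */
struct vacancy {
    struct vacancy *next;
    struct vacancy *previous;
};

/* Push as it behaved before the change: previous is left untouched. */
static struct vacancy *push_without_reset(struct vacancy *vacancy, struct vacancy *head)
{
    vacancy->next = head;
    if (head) {
        head->previous = vacancy;
    }
    return vacancy;
}

/* Push mirroring the posted change: previous is cleared when head is
 * non-NULL. */
static struct vacancy *push_with_reset(struct vacancy *vacancy, struct vacancy *head)
{
    vacancy->next = head;
    if (head) {
        head->previous = vacancy;
        vacancy->previous = NULL;
    }
    return vacancy;
}

/* Hypothetical simplification of the remove step: unlink the vacancy
 * and return the new head of the list. */
static struct vacancy *vacancy_remove(struct vacancy *vacancy, struct vacancy *head)
{
    assert(vacancy != NULL);
    if (vacancy->next) {
        vacancy->next->previous = vacancy->previous;
    }
    if (vacancy->previous) {
        /* The write described in the analysis: with a stale previous,
         * this lands in unrelated memory. */
        vacancy->previous->next = vacancy->next;
    } else {
        head = vacancy->next;
    }
    return head;
}

/* Returns 1 if removing a just-pushed vacancy overwrote an unrelated
 * object, 0 otherwise. */
static int victim_corrupted(int reset_previous)
{
    struct vacancy sentinel = { NULL, NULL };     /* marker address only */
    struct vacancy victim   = { &sentinel, NULL }; /* stands in for a live object */
    struct vacancy old_head = { NULL, NULL };
    struct vacancy released = { NULL, &victim };  /* stale previous left by stack use */

    struct vacancy *head = reset_previous
        ? push_with_reset(&released, &old_head)
        : push_without_reset(&released, &old_head);
    head = vacancy_remove(&released, head);
    (void)head;

    return victim.next != &sentinel;
}
```

<p>In this model, <code>victim_corrupted(0)</code> reports that the unrelated object was overwritten, while <code>victim_corrupted(1)</code> reports that it was left intact. The sketch exercises only the case where the list already has a head.</p>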
<p><a class="user active user-mention" href="https://redmine.ruby-lang.org/users/7848">@ReiOdaira (Rei Odaira)</a> do you think that's sufficient?</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85983 2020-06-04T10:38:53Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul></ul><p><a href="https://github.com/ruby/ruby/pull/3182" class="external">https://github.com/ruby/ruby/pull/3182</a></p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85985 2020-06-04T11:11:02Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul></ul><p>By the way, I've also removed all skips when I rewrote tests into <code>test/ruby/test_stack.rb</code>.</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=85987 2020-06-04T23:42:10Z ioquatix (Samuel Williams) samuel@oriontransfer.net
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Assigned</i></li><li><strong>Assignee</strong> set to <i>ioquatix (Samuel Williams)</i></li></ul><p>I have merged this.</p>
<p><a class="user active user-mention" href="https://redmine.ruby-lang.org/users/7848">@ReiOdaira (Rei Odaira)</a> thanks for your effort, you deserve all the credit for tracking down this issue.</p>
<p>Can you please confirm whether the original issue is fixed? If so, we can close this issue.</p>
<p>Thanks!</p>
https://redmine.ruby-lang.org/issues/16814?journal_id=86004 2020-06-06T02:23:28Z nagachika (Tomoyuki Chikanaga) nagachika00@gmail.com
<ul><li><strong>Backport</strong> changed from <i>2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN</i> to <i>2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED</i></li></ul>
https://redmine.ruby-lang.org/issues/16814?journal_id=90608 2021-02-26T20:20:53Z jeremyevans0 (Jeremy Evans) merch-redmine@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Closed</i></li></ul>
https://redmine.ruby-lang.org/issues/16814?journal_id=91018 2021-03-20T06:59:17Z nagachika (Tomoyuki Chikanaga) nagachika00@gmail.com
<ul><li><strong>Backport</strong> changed from <i>2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED</i> to <i>2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE</i></li></ul><p>ruby_2_7 755a349a3a66f5731995296fe3bb7d2b1712167f merged revision(s) 4bff8e84232594ecb9914e2a8437b7c40a63b799.</p>