Project

General

Profile

Actions

Bug #15050

closed

GC after forking with fibers crashes

Added by normalperson (Eric Wong) over 5 years ago. Updated over 5 years ago.

Status:
Closed
Target version:
-
[ruby-core:88762]

Description

Not sure when to work on this, so leaving this here for now...

The management of stacks for root fiber and regular
fibers is different and confusing. Perhaps unifying
thread and fiber stack cache is the best way to go.

Is a separate class of stacks even necessary?
We should aim to minimize use of stack space.

"[BUG] Illegal root fiber parameter"


Files

Updated by normalperson (Eric Wong) over 5 years ago

wrote:

https://bugs.ruby-lang.org/issues/15050

This fixes the immediate bug, but I think there's other problems
with stack management

diff --git a/cont.c b/cont.c
index 7bc1724176..ab42dfb27a 100644
--- a/cont.c
+++ b/cont.c
@@ -1983,6 +1983,7 @@ rb_fiber_atfork(rb_thread_t *th)
{
if (th->root_fiber) {
if (&th->root_fiber->cont.saved_ec != th->ec) {
+            th->root_fiber->cont.type = FIBER_CONTEXT;
th->root_fiber = th->ec->fiber_ptr;
th->root_fiber->cont.type = ROOT_FIBER_CONTEXT;
}

Updated by ko1 (Koichi Sasada) over 5 years ago

Is this because [Bug #15041]?

Updated by normalperson (Eric Wong) over 5 years ago

wrote:

Is this because [Bug #15041]?

Yes, fix one bug, hit another :< Story of my life

(I found these while working on auto-fiber)

Updated by normalperson (Eric Wong) over 5 years ago

Koichi Sasada wrote:

On 2018/08/31 12:30, Eric Wong wrote:

Yes, fix one bug, hit another :< Story of my life

[Bug #15041] hits something wrong?
(sorry I needed to ask earlier)

No, it's not wrong; but it is an incomplete fix. There
seems to be several problems related to fork + Fiber

Updated by ko1 (Koichi Sasada) over 5 years ago

On 2018/08/31 12:50, Eric Wong wrote:

[Bug #15041] hits something wrong?
(sorry I needed to ask earlier)

No, it's not wrong; but it is an incomplete fix. There
seems to be several problems related to fork + Fiber

Sorry my question is wrong.

What is the problem [Bug #15041] want to solve?

I remember that there are several implicit assumption on root fiber and
so on. I think changing this attribute is bad idea now.

--
// SASADA Koichi at atdot dot net

Updated by normalperson (Eric Wong) over 5 years ago

Koichi Sasada wrote:

On 2018/08/31 12:50, Eric Wong wrote:

[Bug #15041] hits something wrong?
(sorry I needed to ask earlier)

No, it's not wrong; but it is an incomplete fix. There
seems to be several problems related to fork + Fiber

Sorry my question is wrong.

What is the problem [Bug #15041] want to solve?

Switching fiber can crash in child process. r64589 test change
shows it:

-      Fiber.new{ pid = fork {} }.resume
+      Fiber.new do
+        pid = fork do
+          Fiber.new {}.transfer
+        end
+      end.resume

I remember that there are several implicit assumption on root fiber and so
on. I think changing this attribute is bad idea now.

I think we need to change to avoid crashes. We change vm->main_thread
at fork, too. Maybe I'can investigate late next week.

Updated by ko1 (Koichi Sasada) over 5 years ago

On 2018/08/31 15:14, Eric Wong wrote:

What is the problem [Bug #15041] want to solve?

Switching fiber can crash in child process. r64589 test change
shows it:

-      Fiber.new{ pid = fork {} }.resume
+      Fiber.new do
+        pid = fork do
+          Fiber.new {}.transfer
+        end
+      end.resume

I remember that there are several implicit assumption on root fiber and so
on. I think changing this attribute is bad idea now.

I think we need to change to avoid crashes. We change vm->main_thread
at fork, too. Maybe I'can investigate late next week.

Ok. We need to avoid this kind of crash.

--
// SASADA Koichi at atdot dot net

Updated by normalperson (Eric Wong) over 5 years ago

Koichi Sasada wrote:

Ok. We need to avoid this kind of crash.

https://bugs.ruby-lang.org/issues/15050

OK, I think I fixed the issue (also including [Feature #15095])

Greg, there's a small win32-specific change in this series which
seems low risk, but I figured I'd Cc you anyways for testing.
Thanks.

Pull request:

The following changes since commit 64b326c2204e562aa3d6025ca097a82bcf4de303:

spec/ruby/library/socket/addrinfo: require for SocketSpecs (2018-09-09 08:50:53 +0000)

are available in the Git repository at:

https://80x24.org/ruby.git fiber-fork

for you to fetch changes up to ce0a6005fbf78203b3bcaa4b90bf94b8b5c44782:

fiber: fix crash on GC after forking (2018-09-09 09:49:55 +0000)

Or broken out patches:

Or full diff against current trunk (r64663):

https://80x24.org/spew/20180909100319.ild2rlncrmmm2lwp@whir/raw

Actions #9

Updated by normalperson (Eric Wong) over 5 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r64703.


share VM stack between threads and fibers if identical in size

ec->vm_stack is always allocated with malloc, so stack cache for
root fiber (thread stack) and non-root fibers can be shared as
long as the size is the same. The purpose of this change is to
reduce dependencies on ROOT_FIBER_CONTEXT.

[Feature #15095] [Bug #15050]

v2: vm.c: fix build with USE_THREAD_DATA_RECYCLE==0

Updated by MSP-Greg (Greg L) over 5 years ago

@normalperson (Eric Wong)

Eric,

Sorry been busy with 'this is a bigger mess than I thought' kinds of things.

I just ran ruby-loco on r64706 'fiber: fix crash on GC after forking', and the build passed.

Thanks, Greg

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0