Bug #11922: [PATCH] fix ASYNC BUG race from bootstraptest/test_fork.rb

Added by normalperson (Eric Wong) almost 10 years ago. Updated over 9 years ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:72590]

Description

thread_pthread.c (rb_thread_create_timer_thread): fix race

This fixes an occasional [ASYNC BUG] failure in
bootstraptest/test_fork.rb '[ruby-dev:37934]'
which tests fork/pthread_create failure by setting
RLIMIT_NPROC to 1 and triggering EAGAIN on pthread_create
when attempting to recreate the timer thread.

The problem timeline is as follows:

thread 1                           thread 2
---------------------------------------------------------------
rb_thread_create_timer_thread
setup_communication_pipe
                                   rb_thread_wakeup_timer_thread_low
pthread_create fails               pipe looks valid, write!
CLOSE_INVALIDATE (x4)              EBADF -> ASYNC BUG

The checks in rb_thread_wakeup_timer_thread_low only tried to
guarantee proper ordering with native_stop_timer_thread, not
rb_thread_create_timer_thread :x

Now, this should allow rb_thread_create_timer_thread to
synchronize properly with rb_thread_wakeup_timer_thread_low by
delaying the validation marking of the timer_thread_pipe until
we are certain the timer thread is alive.

In this version, rb_thread_wakeup_timer_thread_low becomes a
noop if the timer thread could not be created.  Threading is still
completely broken with NPROC==1, but there's not much we can do
about it besides warn the user.  We no longer spew a scary
[ASYNC BUG] message at them or dump core on them.
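
The ordering described above can be illustrated with a small toy model in Ruby. This is only a sketch, not the actual change to thread_pthread.c: the class TimerPipe, its methods, the fail_create flag (which simulates pthread_create returning EAGAIN), and the Mutex are all hypothetical and exist only for this illustration. The one idea it borrows from the patch is that the pipe is marked valid (its owner is set) only after thread creation has succeeded, so a concurrent wakeup never writes to descriptors that are about to be closed.

```ruby
# Toy model of "publish the pipe only after the timer thread exists".
# All names here are hypothetical; the real fix is C code in thread_pthread.c.
class TimerPipe
  def initialize
    @lock = Mutex.new   # assumption: the real code does not use a Mutex here
    @writer = nil
    @owner_pid = nil    # nil means "pipe not valid, do not write"
    @thread = nil
  end

  # Returns true if the timer thread started, false if creation failed.
  # fail_create simulates pthread_create failing with EAGAIN.
  def create_timer_thread(fail_create: false)
    reader, writer = IO.pipe
    begin
      raise ThreadError, "simulated EAGAIN" if fail_create
      @thread = Thread.new { reader.read(1); reader.close }
    rescue ThreadError
      # Creation failed: close both ends. The pipe was never marked
      # valid, so no wakeup can have seen these descriptors.
      reader.close
      writer.close
      return false
    end
    # Only now, with the thread alive, mark the pipe as valid.
    @lock.synchronize do
      @writer = writer
      @owner_pid = Process.pid
    end
    true
  end

  # Returns :written if a byte was sent, :noop if the pipe is not valid.
  def wakeup
    @lock.synchronize do
      return :noop unless @owner_pid == Process.pid
      @writer.write("!")
      :written
    end
  end

  def join
    @thread&.join
  end
end

pipe = TimerPipe.new
pipe.create_timer_thread(fail_create: true)
p pipe.wakeup   # wakeup after a failed creation is a noop
```

In the failing case the wakeup returns :noop instead of writing to a closed descriptor, which corresponds to the EBADF -> ASYNC BUG step in the timeline no longer being reachable.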

Note: I'm testing this overnight with the [ruby-dev:37934] bit extracted
from bootstraptest/test_fork.rb:

	  main = Thread.current
	  Thread.new { sleep 0.01 until main.stop?; Thread.kill main }
	  Process.setrlimit(:NPROC, 1)
	  fork {}

This bug seems easier to reproduce on my weak VM with 32-bit userspace
(64-bit kernel) than on more powerful machines.  Even without this
patch, it could take hours to reproduce the race.  I haven't been able
to reproduce this bug at all on my Phenom II machine.

Way too tired to be committing this right now...

Updated by Anonymous almost 10 years ago #1

  • Status changed from Open to Closed

Applied in changeset r53373.


thread_pthread.c (rb_thread_create_timer_thread): fix race

This fixes an occasional [ASYNC BUG] failure in
bootstraptest/test_fork.rb '[ruby-dev:37934]'
which tests fork/pthread_create failure by setting
RLIMIT_NPROC to 1 and triggering EAGAIN on pthread_create
when attempting to recreate the timer thread.

The problem timeline is as follows:

thread 1                           thread 2
---------------------------------------------------------------
rb_thread_create_timer_thread
setup_communication_pipe
                                   rb_thread_wakeup_timer_thread_low
pthread_create fails               pipe looks valid, write!
CLOSE_INVALIDATE (x4)              EBADF -> ASYNC BUG

The checks in rb_thread_wakeup_timer_thread_low only tried to
guarantee proper ordering with native_stop_timer_thread, not
rb_thread_create_timer_thread :x

Now, this should allow rb_thread_create_timer_thread to
synchronize properly with rb_thread_wakeup_timer_thread_low by
delaying the validation marking of the timer_thread_pipe until
we are certain the timer thread is alive.

In this version, rb_thread_wakeup_timer_thread_low becomes a
noop if the timer thread could not be created. Threading is still
completely broken with NPROC==1, but there's not much we can do
about it besides warn the user. We no longer spew a scary
[ASYNC BUG] message or dump core on them.

  • thread_pthread.c (setup_communication_pipe): delay setting owner
    (rb_thread_create_timer_thread): until thread creation succeeds
    [ruby-core:72590] [Bug #11922]

Updated by naruse (Yui NARUSE) over 9 years ago #2 [ruby-core:74717]

  • Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: REQUIRED to 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: DONE

ruby_2_3 r54426 merged revision(s) 53373.
