Bug #2025: problem with pthread handling on non NPTL platform - Ruby - Ruby Issue Tracking System

Bug #2025

closed

problem with pthread handling on non NPTL platform

Added by Petr.Salinger@seznam.cz (Petr Salinger) almost 16 years ago. Updated almost 13 years ago.

Status:

Closed

Assignee:

mame (Yusuke Endoh)

Target version:

2.0.0

ruby -v:

1.9.1.243

Backport:

[ruby-core:25217]

Description

=begin
While trying to fix some test suite failures on GNU/kFreeBSD
(http://bugs.debian.org//cgi-bin/bugreport.cgi?bug=542927),
I found some problems in the pthread-related code.
The hang in the first test in
http://redmine.ruby-lang.org/issues/show/1525
also affects us.

IMO, Ruby should try to work with any POSIX-conforming pthread
implementation, not only NPTL.
A code audit of this area seems to be needed.

There are some problems with the handling of fork()/exec().
Locks really should be reinitialized in the child, and
the timer thread should be started using pthread_once(). The current
approach is fragile and might lead to more than one timer thread being started.
http://www.opengroup.org/onlinepubs/9699919799/functions/pthread_once.html
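
A minimal sketch of the pthread_once() idea, for illustration only. All names here (timer_once, ensure_timer_thread and so on) are hypothetical and are not the identifiers used in thread_pthread.c. The sketch also does not cover restarting the timer in a forked child.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names; not the identifiers used in thread_pthread.c. */
static pthread_once_t timer_once = PTHREAD_ONCE_INIT;
static pthread_t timer_tid;
static int timer_starts = 0;      /* how many times the init routine ran */
static int timer_create_rc = -1;  /* result of pthread_create() */

static void *timer_main(void *arg)
{
    (void)arg;
    /* A real timer thread would loop here; the sketch returns at once. */
    return NULL;
}

/* Runs at most once per process, however many callers race to get here. */
static void start_timer_once(void)
{
    timer_starts++;
    timer_create_rc = pthread_create(&timer_tid, NULL, timer_main, NULL);
}

/* May be called from any thread, any number of times. Returns 0 on success. */
int ensure_timer_thread(void)
{
    int rc = pthread_once(&timer_once, start_timer_once);
    return rc != 0 ? rc : timer_create_rc;
}

/* Requests the timer twice and reports how often it was actually created. */
int demo_timer_once(void)
{
    int rc1 = ensure_timer_thread();
    int rc2 = ensure_timer_thread();
    assert(rc1 == 0 && rc2 == 0);
    pthread_join(timer_tid, NULL);
    return timer_starts;
}
```

The point of the sketch is that the second call to ensure_timer_thread() cannot create a second thread, so no separate "is the timer already running" check is needed.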

In general, I do not understand how the following code in thread_pthread.c:

static pthread_t timer_thread_id;
static pthread_cond_t timer_thread_cond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t timer_thread_lock = PTHREAD_MUTEX_INITIALIZER;
rb_thread_create_timer_thread()
thread_timer()

could correctly survive fork(); see also
http://www.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html
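
A rough sketch of how pthread_atfork() handlers can keep a statically initialized lock usable across fork(). The names demo_lock, before_fork and so on are hypothetical. Note that this uses the lock/unlock pattern from the pthread_atfork() rationale rather than reinitializing the lock in the child, and it assumes a default (non-robust, non-recursive) mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for something like timer_thread_lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* prepare: take the lock so no other thread holds it while fork() runs. */
static void before_fork(void) { pthread_mutex_lock(&demo_lock); }

/* parent: release the lock taken in the prepare handler. */
static void parent_after_fork(void) { pthread_mutex_unlock(&demo_lock); }

/* child: the only thread in the child is a copy of the forking thread,
 * which took the lock in the prepare handler, so it releases it here. */
static void child_after_fork(void) { pthread_mutex_unlock(&demo_lock); }

/* Returns 0 if the lock is usable in both parent and child after fork(). */
int demo_fork_lock(void)
{
    int status = 0;
    int rc;
    pid_t pid;

    if (pthread_atfork(before_fork, parent_after_fork, child_after_fork) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the lock must be free here, or trylock would fail. */
        rc = pthread_mutex_trylock(&demo_lock);
        if (rc == 0)
            pthread_mutex_unlock(&demo_lock);
        _exit(rc == 0 ? 0 : 1);
    }

    /* Parent: the parent handler must have released the lock again. */
    rc = pthread_mutex_trylock(&demo_lock);
    assert(rc == 0);
    pthread_mutex_unlock(&demo_lock);

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

This says nothing about the timer thread itself, which does not exist in the child after fork() and would have to be started again there.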

I really doubt that the following code in process.c,
in rb_f_fork(VALUE obj), is correct:

    switch (pid = rb_fork(0, 0, 0, Qnil)) {
      case 0:
#ifdef linux
        after_exec();
#endif
        rb_thread_atfork();
        if (rb_block_given_p()) {
            int status;

            rb_protect(rb_yield, Qundef, &status);
            ruby_stop(status);
        }

The conditional after_exec() call shouldn't be here.
There is already an after_fork() call at line 2331,
which is executed in both the parent and the child.
The exception is when chfunc is not NULL;
in that case it is not executed at all.

The bug is timing-dependent, i.e. there is a race condition.
Sometimes the child process ends up with 2 timer threads, sometimes
with the expected 1.

The probability of getting 2 is merely higher on linuxthreads than on NPTL;
it can happen under any pthread implementation.

Ruby should not create a thread with PTHREAD_CREATE_DETACHED and then call pthread_join() on it.
http://www.opengroup.org/onlinepubs/9699919799/functions/pthread_join.html:
"The behavior is undefined if the value specified by the thread argument
to pthread_join() does not refer to a joinable thread."
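
For illustration, a small sketch of the conforming combination: a thread that is explicitly created joinable and then joined. The function names are hypothetical and the worker is only a placeholder.

```c
#include <assert.h>
#include <pthread.h>

/* Placeholder worker: stores a value through its argument and returns it. */
static void *worker(void *arg)
{
    int *value = arg;
    *value = 42;
    return arg;
}

/* Creates a joinable thread, joins it, and returns the value it stored. */
int demo_joinable(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    void *result = NULL;
    int value = 0;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    /* Joinable is the default; it is set explicitly here to make the
     * contrast with PTHREAD_CREATE_DETACHED visible. */
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    rc = pthread_create(&tid, &attr, worker, &value);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    /* Defined behavior: tid refers to a joinable thread. */
    rc = pthread_join(tid, &result);
    assert(rc == 0 && result == &value);
    return value;
}
```

If a thread really has to be detached, it should simply never be passed to pthread_join().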

Ruby should use pthread_sigmask() instead of sigprocmask()
when it is available, and so on.
http://www.opengroup.org/onlinepubs/9699919799/functions/pthread_sigmask.html:
"The use of the sigprocmask() function is unspecified in a multi-threaded process."
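
As a sketch of what the pthread_sigmask() variant looks like, the following blocks SIGUSR1 for the calling thread, reads the mask back, and restores the previous mask. The function name and the choice of SIGUSR1 are only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Blocks SIGUSR1 in the calling thread, checks the resulting mask, then
 * restores the old mask. Returns 1 if SIGUSR1 was blocked, -1 on error. */
int demo_block_sigusr1(void)
{
    sigset_t set, old, current;
    int blocked;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Same arguments as sigprocmask(), but defined for threaded programs. */
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    /* With a NULL set the call only reports the current mask. */
    if (pthread_sigmask(SIG_BLOCK, NULL, &current) != 0)
        return -1;
    blocked = sigismember(&current, SIGUSR1);

    /* Put the previous mask back so the caller is not affected. */
    rc = pthread_sigmask(SIG_SETMASK, &old, NULL);
    assert(rc == 0);
    return blocked;
}
```

Unlike sigprocmask(), pthread_sigmask() returns an error number directly instead of setting errno.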

This would work correctly on both linuxthreads and NPTL, and should work on any
POSIX-conforming pthread implementation.
Ideally, Ruby would not require full conformance but would also
tolerate some known deviations, such as our getpid() difference.
=end

Related issues 1 (0 open — 1 closed)


Updated by mame (Yusuke Endoh) over 15 years ago

Updated by Petr.Salinger@seznam.cz (Petr Salinger) over 15 years ago

Updated by mame (Yusuke Endoh) over 15 years ago

Updated by mame (Yusuke Endoh) over 15 years ago