Project

General

Profile

Actions

Backport #8805

closed

Ruby GC::Profiler returns incorrect info on Solaris (and relatives)

Added by sax (Eric Saxby) over 11 years ago. Updated over 11 years ago.


Description

We use SmartOS as our deployment platform, and noticed when attempting to roll out Ruby 2.0.0-p247 to some SmartOS hosts that garbage collection info is broken with integrations such as New Relic. Investigating further, we found that GC::Profiler.total_time always returns 0.0.

GC::Profiler.enable
=> nil
GC.start
=> nil
GC::Profiler.total_time
=> 0.0

It may be related to this issue: https://bugs.ruby-lang.org/issues/7500 where the mechanism by which the profiler gets timestamps from the OS.

Updated by sax (Eric Saxby) over 11 years ago

I grabbed the c test code from the other ticket, and with some changes got these results:

cat ./timing.c
#include <stdlib.h>
#include <stdio.h>
#include <sys/resource.h>
#include <time.h>
#include <math.h>

double getrusage_time() {
struct rusage usage;
struct timeval time;
getrusage(RUSAGE_SELF, &usage);
time = usage.ru_utime;
return time.tv_sec + time.tv_usec * 1e-6;
}

void print_clock_res(clock_id) {
int rc;
struct timespec res;

    rc = clock_getres(clock_id, &res);
    if (rc == 0)
            printf("clock_res() for %d: %ldns\n", clock_id, res.tv_nsec);

}

double clock_time(clock_id) {
struct timespec ts;

            if (clock_gettime(clock_id, &ts) == 0) {
                            return ts.tv_sec + ts.tv_nsec * 1e-9;
            }
            return 0.0;

}

void print_clock_time(clock_id) {
int n;

            printf("clock_gettime() for %d before: %f\n", clock_id, clock_time(clock_id));
            for (n=0; n<10000; n++) pow(2, 2048);
            printf("clock_gettime() for %d after: %f\n", clock_id, clock_time(clock_id));

}

int main() {
int n;

            printf("getrusage() before: %f\n", getrusage_time());
            for (n=0; n<10000; n++) pow(2, 2048);
            printf("getrusage() after: %f\n", getrusage_time());

printf("\nclock_gettime() CLOCK_HIGHRES\n");

print_clock_res(CLOCK_HIGHRES);
print_clock_time(CLOCK_HIGHRES);

printf("\nclock_gettime() CLOCK_PROCESS_CPUTIME_ID\n");

print_clock_res(CLOCK_PROCESS_CPUTIME_ID);
print_clock_time(CLOCK_PROCESS_CPUTIME_ID);

printf("\nclock_gettime() CLOCK_THREAD_CPUTIME_ID\n");

print_clock_res(CLOCK_THREAD_CPUTIME_ID);
print_clock_time(CLOCK_THREAD_CPUTIME_ID);

printf("\nclock_gettime() CLOCK_MONOTONIC\n");

print_clock_res(CLOCK_MONOTONIC);
print_clock_time(CLOCK_MONOTONIC);

}

gcc -o timing timing.c -lrt

./timing
getrusage() before: 0.000382
getrusage() after: 0.000432

clock_gettime() CLOCK_HIGHRES
clock_res() for 4: 20ns
clock_gettime() for 4 before: 3140535.103844
clock_gettime() for 4 after: 3140535.103877

clock_gettime() CLOCK_PROCESS_CPUTIME_ID
clock_gettime() for 5 before: 0.000000
clock_gettime() for 5 after: 0.000000

clock_gettime() CLOCK_THREAD_CPUTIME_ID
clock_gettime() for 2 before: 0.000000
clock_gettime() for 2 after: 0.000000

clock_gettime() CLOCK_MONOTONIC
clock_res() for 4: 20ns
clock_gettime() for 4 before: 3140535.103960
clock_gettime() for 4 after: 3140535.103990

Updated by sax (Eric Saxby) over 11 years ago

After more investigation, the clock_gettime calls for CLOCK_PROCESS_CPUTIME_ID return -1, with an errno of 22. I think this maps to: EINVAL The clock_id argument does not specify a known clock.

Sooo I think for SmartOS, and probably for Solaris in general, GC::Profiler should use getrusage() instead of clock_gettime(). I'm a little unclear on the best way to approach this, however. Maybe configure should test for a 0 response to clock_gettime, and set a flag specific to CLOCK_PROCESS_CPUTIME_ID that gc.c can pick up?

Actions #3

Updated by akr (Akira Tanaka) over 11 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r42639.
Eric, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • gc.c (getrusage_time): Fallback clock_gettime to getrusage when
    clock_gettime fails.
    Reported by Eric Saxby. [ruby-core:56762] [Bug #8805]

Updated by akr (Akira Tanaka) over 11 years ago

  • Status changed from Closed to Open
  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN to 1.9.3: DONTNEED, 2.0.0: REQUIRED

Updated by sax (Eric Saxby) over 11 years ago

Fantastic! I've rolled out this patch to some SmartOS hosts this morning and it fixes the GC::Profiler total_time reporting.

Thank you so much!

Updated by sax (Eric Saxby) over 11 years ago

Hi Akira,

I have one question. It looks like this has only been applied to trunk, and not to the ruby_2_0_0 branch. I'm not yet totally familiar with the version control workflow of ruby-core, but does that mean that this will be released in 2.1? Will it also be in the next patch release of 2.0?

Thank you!

Updated by akr (Akira Tanaka) over 11 years ago

2013/8/22 sax (Eric Saxby) :

I have one question. It looks like this has only been applied to trunk, and not to the ruby_2_0_0 branch. I'm not yet totally familiar with the version control workflow of ruby-core, but does that mean that this will be released in 2.1? Will it also be in the next patch release of 2.0?

I think this should be backported to Ruby 2.0.0.

So I reopened this issue and updated the backport field as:

The Ruby 2.0.0 maintainer (nagachika) may backport the patch.

Tanaka Akira

Actions #8

Updated by nagachika (Tomoyuki Chikanaga) over 11 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport200
  • Category deleted (core)
  • Status changed from Open to Assigned
  • Assignee set to nagachika (Tomoyuki Chikanaga)
Actions #9

Updated by nagachika (Tomoyuki Chikanaga) over 11 years ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r42653.
Eric, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


merge revision(s) 42639: [Backport #8805]

* gc.c (getrusage_time): Fallback clock_gettime to getrusage when
  clock_gettime fails.
  Reported by Eric Saxby.  [ruby-core:56762] [Bug #8805]

Updated by sax (Eric Saxby) over 11 years ago

Thank you both!

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0