Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread - Ruby - Ruby Issue Tracking System


Updated by nobu (Nobuyoshi Nakada) over 7 years ago

This seg fault happens consistently on macOS (I'm reproducing it on a late-2015 MacBook Pro running 10.13.3, and it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.

 Small repro case: 

 ```ruby
 enum = Enumerator.new { |y| y << 1 }
 thread = Thread.new { enum.peek }    # enum.next also causes the segfault, but enum.size does not
 thread.join
 GC.start     # <- seg fault here
 ```

 The C-level backtrace identifies this as within the mark phase of GC: 

 ``` 
 -- C level backtrace information ------------------------------------------- 
 0     ruby                                  0x000000010f77ced7 rb_vm_bugreport + 135 
 1     ruby                                  0x000000010f602628 rb_bug_context + 472 
 2     ruby                                  0x000000010f6f1491 sigsegv + 81 
 3     libsystem_platform.dylib              0x00007fff6a779f5a _sigtramp + 26 
 4     ruby                                  0x000000010f61bb93 rb_gc_mark_machine_stack + 99 
 5     ruby                                  0x000000010f76bf39 rb_execution_context_mark + 137 
 6     ruby                                  0x000000010f5ea32b cont_mark + 27 
 7     ruby                                  0x000000010f626a02 gc_marks_rest + 146 
 8     ruby                                  0x000000010f6253c0 gc_start + 2816 
 9     ruby                                  0x000000010f61d628 garbage_collect + 184 
 10    ruby                                  0x000000010f622215 gc_start_internal + 485 
 11    ruby                                  0x000000010f7703be vm_call_cfunc + 286 
 12    ruby                                  0x000000010f759af4 vm_exec_core + 12260 
 13    ruby                                  0x000000010f76ac8e vm_exec + 142 
 14    ruby                                  0x000000010f60c101 ruby_exec_internal + 177 
 15    ruby                                  0x000000010f60bff8 ruby_run_node + 56 
 16    ruby                                  0x000000010f592d1f main + 79 

 ```

 I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

 ```
 -- C level backtrace information ------------------------------------------- 
 0     libruby.2.5.dylib                     0x000000010c416e19 rb_print_backtrace + 25 
 1     libruby.2.5.dylib                     0x000000010c416f28 rb_vm_bugreport + 136 
 2     libruby.2.5.dylib                     0x000000010c2096f2 rb_bug_context + 450 
 3     libruby.2.5.dylib                     0x000000010c35b4ee sigsegv + 94 
 4     libsystem_platform.dylib              0x00007fff6a779f5a _sigtramp + 26 
 5     libruby.2.5.dylib                     0x000000010c2395a1 mark_locations_array + 49 
 6     libruby.2.5.dylib                     0x000000010c22a5bb gc_mark_locations + 75 
 7     libruby.2.5.dylib                     0x000000010c22a7d9 mark_stack_locations + 41 
 8     libruby.2.5.dylib                     0x000000010c22a79f rb_gc_mark_machine_stack + 79 
 9     libruby.2.5.dylib                     0x000000010c3f8868 rb_execution_context_mark + 264 
 10    libruby.2.5.dylib                     0x000000010c1e263e cont_mark + 46 
 11    libruby.2.5.dylib                     0x000000010c1e2572 fiber_mark + 146 
 12    libruby.2.5.dylib                     0x000000010c22f4c6 gc_mark_children + 1094 
 13    libruby.2.5.dylib                     0x000000010c23734c gc_mark_stacked_objects + 108 
 14    libruby.2.5.dylib                     0x000000010c237a5b gc_mark_stacked_objects_all + 27 
 15    libruby.2.5.dylib                     0x000000010c236cb1 gc_marks_rest + 129 
 16    libruby.2.5.dylib                     0x000000010c238787 gc_marks + 103 
 17    libruby.2.5.dylib                     0x000000010c2352e2 gc_start + 802 
 18    libruby.2.5.dylib                     0x000000010c22ca18 garbage_collect + 56 
 19    libruby.2.5.dylib                     0x000000010c231f7d gc_start_internal + 493 
 20    libruby.2.5.dylib                     0x000000010c401f2a call_cfunc_m1 + 42 
 21    libruby.2.5.dylib                     0x000000010c400d1d vm_call_cfunc_with_frame + 605 
 22    libruby.2.5.dylib                     0x000000010c3fc41d vm_call_cfunc + 173 
 23    libruby.2.5.dylib                     0x000000010c3fb8fe vm_call_method_each_type + 190 
 24    libruby.2.5.dylib                     0x000000010c3fb690 vm_call_method + 160 
 25    libruby.2.5.dylib                     0x000000010c3fb5e5 vm_call_general + 53 
 26    libruby.2.5.dylib                     0x000000010c3e784e vm_exec_core + 8974 
 27    libruby.2.5.dylib                     0x000000010c3f6fe6 vm_exec + 182 
 28    libruby.2.5.dylib                     0x000000010c3f7d5b rb_iseq_eval_main + 43 
 29    libruby.2.5.dylib                     0x000000010c214208 ruby_exec_internal + 232 
 30    libruby.2.5.dylib                     0x000000010c214111 ruby_exec_node + 33 
 31    libruby.2.5.dylib                     0x000000010c2140d0 ruby_run_node + 64 
 32    ruby                                  0x000000010c16ff2f main + 95 
 ``` 

 As far as I can tell, the C statement triggering the segfault is here in gc.c (around line 4064):

 ```C 
 static void 
 mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n) 
 { 
     VALUE v; 
     while (n--) { 
         v = *x;              // <----- Seems to be crashing here? 
         gc_mark_maybe(objspace, v); 
         x++; 
     } 
 } 
 ``` 

 This indicates a bad pointer in the machine stack.

 I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an element of an `Enumerator` within a separate thread, and then waiting for the thread to end.
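
 For checking other Ruby builds without crashing the calling process, here is a minimal sketch that runs the repro in a child interpreter and reports how it exited. The helper names `repro_script` and `run_repro` are made up for illustration and are not part of Ruby. It also tries `enum.next` and `enum.size`, the variants mentioned in the repro comment above.

 ```ruby
 require "rbconfig"
 require "open3"

 # Builds the repro script from this report for a given Enumerator method.
 # `repro_script` and `run_repro` are hypothetical helper names.
 def repro_script(method_name)
   <<~RUBY
     enum = Enumerator.new { |y| y << 1 }
     thread = Thread.new { enum.#{method_name} }
     thread.join
     GC.start
   RUBY
 end

 # Runs the script in a child interpreter (by default the same binary as
 # the caller) and returns its Process::Status.
 def run_repro(method_name, ruby = RbConfig.ruby)
   _out, _err, status = Open3.capture3(ruby, "-e", repro_script(method_name))
   status
 end

 %w[peek next size].each do |method_name|
   status = run_repro(method_name)
   outcome = status.success? ? "exited normally" : "failed (signaled: #{status.signaled?})"
   puts "Ruby #{RUBY_VERSION}, enum.#{method_name}: #{outcome}"
 end
 ```

 If the interpreter being tested is affected, I would expect the `peek` and `next` runs to report a failure and the `size` run to exit normally. Passing a different interpreter path as the second argument to `run_repro` should allow comparing builds, though the printed version is that of the calling interpreter.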
