Feature #20351

Updated by eightbitraptor (Matthew Valentine-House) about 1 month ago

**UPDATE: Based on feedback on the original PR (thank you @katei and @nobu) we have 
 changed our approach to this project.** 

 **An updated PR can be found here: [[GH #10456]](https://github.com/ruby/ruby/pull/10456)** 

 --- 

 ~~[GitHub PR#10302](https://github.com/ruby/ruby/pull/10302)~~ 

 **NOTE: This proposal does not change the default build of Ruby, and therefore 
 should NOT cause performance degradation for Ruby built in the usual way** 

 Our long-term goal is to standardise Ruby's GC interface, allowing alternative 
 GC implementations to be used. This will be achieved by optionally building 
 Ruby's GC as a shared object, enabling it to be replaced at runtime using 
 `LD_LIBRARY_PATH`, e.g.: 

 ``` 
 LD_LIBRARY_PATH=/custom_gc_location ruby script.rb 
 ``` 

 This ticket proposes the first step towards this goal: a new experimental build 
 option, `--enable-shared-gc`, that will compile a module as a shared object and 
 link it into the built `ruby` binary. `miniruby` will remain statically linked 
 to the existing GC in all cases. 


 Similar methods of replacing functionality relied on by Ruby have 
 precedent: `jemalloc` can use `LD_PRELOAD` to replace the `glibc`-provided 
 `malloc` and `free` at runtime. However, this project will be the first time 
 such a technique has been used to replace core Ruby functionality. 

 This flag will be marked as experimental & **disabled by default**. 

 [The PR linked from this ticket](https://github.com/ruby/ruby/pull/10302) implements the new build flag, along with the 
 absolute minimum code required to test its implementation (a single debug 
 function). 

 The implementation of the new build flag is based on the existing implementation 
 of `--enable-shared` and behaves as follows: 

 - `--enable-shared --enable-shared-gc` 
  
   This will build both `libruby` and `librubygc` as shared objects. `ruby` will 
   link dynamically to both `libruby` and `librubygc`. 
  
 - `--disable-shared --enable-shared-gc` 

   This will build `librubygc` as a shared object, and build `libruby` as a 
   static object. `libruby` will link dynamically to `librubygc` and `ruby` will 
   be statically linked to `libruby`. 
  
 - `--disable-shared-gc` 

   **This will be the default**. In this case the build behaviour will be 
   exactly the same as it is currently, i.e. the existing Ruby GC will be 
   built and linked statically into either `ruby` or `libruby.so`, depending on 
   the state of `--enable-shared`. 
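
 As a rough sketch, the three configurations above might be selected as follows. 
 This assumes a Ruby source checkout that includes the linked PR and a Linux host 
 where `ldd` is available; neither assumption comes from the ticket itself, and 
 only the `--enable-shared`, `--disable-shared`, `--enable-shared-gc` and 
 `--disable-shared-gc` flags are taken from the text above. 

```shell
# Sketch only: run from a Ruby source checkout that includes the linked PR.
./autogen.sh   # generates ./configure in a git checkout

# Pick ONE of the following configurations.

# 1. libruby and librubygc both built as shared objects.
./configure --enable-shared --enable-shared-gc

# 2. librubygc shared, libruby static.
# ./configure --disable-shared --enable-shared-gc

# 3. Default behaviour: existing GC linked statically.
# ./configure --disable-shared-gc

make

# On Linux, list the shared libraries the built binary depends on.
# The output depends on which configuration was chosen.
ldd ./ruby | grep libruby || echo "no libruby* shared dependencies"
```

 Using a separate build directory for each configuration avoids mixing objects 
 from different builds. 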
  
 We are aware that there will be a small performance penalty from moving the GC 
 logic into a shared object, but this is an opt-in configuration, turned on at 
 build time and intended for experienced users. 

 Still, we anticipate that, even with this configuration turned on, this penalty 
 will be negligible compared to the benefit that being able to use high 
 performance GC algorithms will provide. 

 This performance penalty is also the reason that **this feature will be disabled 
 by default**. There will be no performance impact for anyone compiling Ruby in 
 the usual manner, without explicitly enabling this feature. 

 We have discussed this proposal with @matz who has approved our work on this 
 project - having a clear abstraction between the VM and the GC will enable us to 
 iterate faster on improvements to Ruby's existing GC. 

 ## Motivation 

 In the long term we want to provide the ability to override the current Ruby GC 
 implementation in order to: 

 * Experiment with modern high-performance GC implementations, such as Immix, G1, 
   LXR etc. 
 * Easily split-test changes to the GC, or the GC tuning, in production without 
   having to rebuild Ruby 
 * Easily use debug builds of the GC to help identify production problems and 
   bottlenecks without having to rebuild Ruby 
 * Encourage the academic memory management research community to consider Ruby 
   for their research (the current work on [MMTk & Ruby](https://github.com/mmtk/mmtk-ruby) is a good example of 
   this). 

 ## Future work 

 The initial implementation of the shared GC module in this PR is deliberately 
 small, and exists only for testing the build system integration. 

 The next steps are to identify boundaries between the GC and the VM and begin to 
 extract common functionality into this GC wrapper module to serve as the 
 foundation of our GC interface. 

 ## Who's working on this 
  
 - @eightbitraptor 
 - @tenderlovemaking  
 - @peterzhu2118 
 - @eileencodes
