Project

General

Profile

Feature #19874

Updated by eightbitraptor (Matt V-H) about 1 year ago

See Github PR: [#8420](https://github.com/ruby/ruby/pull/8420) 

 <hr /> 

 ## Introduction 

 In Ruby 2.7 compaction was originally introduced as an experimental feature. 
 There was no auto-compaction, and compaction had to be manually triggered using 
 `GC.compact`. 

 See [this commit](https://github.com/ruby/ruby/commit/3ef4db15e95740839a0ed6d0224b2c9562bb2544). 

 When this happened, at the start of compaction the heap was sorted so that pages 
 containing the [most pinned slots were compacted into by 
 default](https://github.com/ruby/ruby/commit/3ef4db15e95740839a0ed6d0224b2c9562bb2544#diff-d1cee85c3b0e24a64519c11150abe26fd0b5d8628a23d356dd0b535ac4595d49R7409). 

 This was important for compaction efficiency and copy on write friendliness as 
 it prioritises filling pages that can't be free'd and emptying as many pages as possible. 

 When `GC.auto_compact`, was introduced [in this commit](). This heap sorting 
 mechanism was (unintentionally?) removed. 

 This means that the heap is now unsorted when compacted and we're just 
 compacting towards the lowest addresses from the higher ones. 

 This PR re-introduces this feature: by default the heap will now be sorted by 
 pinned object count prior to compaction. 

 This will improve the CoW friendliness of compaction to aid the reforking work 
 being carried out by @byroot. byroot. 

 ## Verifying the behaviour 

 We used a test script to build a heap that would be laid out in a prescribed 
 manner. with a predictable pinning pattern in order to test that pages with 
 higher numbers of pinned objects are being filled first during compaction. 

 Test script was as follows: 

 ``` 
 require 'fiddle' 
 require 'objspace' 

 objcount = 1_000_000 

 GC.disable 
 list = [] 

 # Create sparsely populated pages 
 32000.times do 
   list << Object.new 
   Object.new 
   Object.new 
   Object.new 
   Object.new 
   Object.new 
   Object.new 
 end 

 # less sparsely populated pages 
 32000.times do 
   list << Object.new 
   Object.new 
   Object.new 
   Object.new 
 end 

 # densly populated pages 
 32000.times do 
   list << Object.new 
   Object.new 
 end 

 # packed pages 
 32000.times do 
   list << Object.new 
 end 

 # Pin the set of objects and free the rest in order to create our controlled heap 
 pinned = list.map { |x| Fiddle::Pinned.new(x) } 
 GC.start 

 # Dump the heap before compaction, this should contain lots of pages that contain 
 # only pinned slots, in varying density patterns, followed by a solid chunk of 
 # movable objects in the 40 byte size pool (these are the objects created by 
 # `Fiddle::Pinned.new`). 
 f = File.open("before.log", "w") 
 ObjectSpace.dump_all(shapes: false, output: f) 

 GC.compact 
 GC.start 
 puts `ps -o rss= -p #{$$}` 

 # Dump the heap after compaction, this should show the same density pattern that we saw 
 # in the previous dump, but this time the solid region of 40 byte objects should be compacted 
 # into the pages with the most pinned objects first 
 f = File.open("after.log", "w") 
 ObjectSpace.dump_all(shapes: false, output: f) 

 ``` 

 ## Heap diagram before 

 <img src="https://github.com/ruby/ruby/assets/31869/bffc0090-4790-477d-9476-bea5203fd9c2" width=400 /> 

 This diagram shows heap pages vertically with the pinned object distribution pattern set up by the test script. 

 ## Heap diagram after (master branch) 

 <img src="https://github.com/ruby/ruby/assets/31869/3d3b8ff5-1625-4e05-b832-95d28f1a2fe5" width=400 /> 

 This diagram shows the heap having been compacted without this patch. We can see that the heap has not been sorted by pages, and that movable objects (green) are being compacted into pages that contain fewer pinned objects, and leaving pages that contain lots of pinned objects sparsely populated. 

 ## Heap diagram after (this branch) 

 <img src="https://github.com/ruby/ruby/assets/31869/5ebdd282-b167-4ae4-b718-15fb095cf364" width=400 /> 

 This diagram shows the heap having been compacted into pinned pages. This diagram shows pages sorted by their order in the `heap->pages` list. So will respect the sorting that happens during compaction. 

 ## Benchmarking 

 ### Compaction speed 

 This plot shows mean and standard deviation of `GC.compact` time for 6 runs of the test script defined above on both this branch and master. Measured using `Benchmark.bm` 

 <img src="https://github.com/ruby/ruby/assets/31869/fbe23946-08dd-4d8f-890c-5ab9be2dd50c" width=640 /> 

 Raw values are: 

 | branch name | mean           | std deviation | 
 | ----------- | ------------ | ------------- | 
 | this branch | 0.0125643333 | 0.0004797644    | 
 | master        | 0.0121891667 | 0.0008149953    | 

 This data appears shows a 3% slowdown in GC compation times on this branch which is expected given the extra work required to sort the heap prior to compaction. 

 However, given the sizeable overlap in the error bars we can conclude that this difference is not significant. 

 ### Post compaction RSS 

 We used a modified test script that created a heap with a lot of pinned slots at the end of the heap. 

 The expected outcome is that if the heap remains unsorted, RSS will be higher because objects will be packed into the front of the heap, and then followed by sparsely populated pages of pinned objects. 

 In a sorted heap, the objects will be packed in around the pages of pinned objects where possible allowing more empty pages to be free'd and RSS to be lower. 

 Here is the test script used: 

 ``` 
 require 'fiddle' 

 GC.disable 
 list = [] 

 288000.times do 
   Object.new 
 end 

 36000.times do 
   list << Object.new 
   50.times do 
     Object.new 
   end 
 end 

 pinned = list.map { |x| Fiddle::Pinned.new(x) } 
 GC.start 
 GC.compact 
 puts `ps -o rss= -p #{$$}` 

 ``` 

 And here is the outcome: 

 <img src="https://github.com/ruby/ruby/assets/31869/96716838-4d81-41fd-a91f-4704bed9257e" width=640 /> 

 Raw data: 

 | branch name | mean         | std deviation | 
 | ----------- | ---------- | ------------- | 
 | this branch | 92200.0000 | 332.9769        | 
 | master        | 93594.6667 | 188.9536        | 

 We can see from this data that sorting the heap by pinned pages results in a 1.5% decrease in memory usage, and we can infer from the non-overlapping error bars that this is likely to be statistically significant. 

 ## Conclusion 

 Re-introducing heap sorting by pinned slots prior to compaction improves memory usage of a Ruby process by a statistically significant amount. It does this at a very slight, and likely insignificant performance penalty, which we believe to be acceptable given that GC compaction is currently a fairly major "stop the world" event that should only be done occasionally. For example, before forking. 

Back