Feature #18498

Updated by byroot (Jean Boussier) almost 3 years ago

This is a clean take on #16038  

 ### Spec 

 #### Weak keys only 

 After a chat with @eregon, we believe the most sensible and useful option would be 
 a "WeakKeysMap", meaning **the keys would be weak references, but the values would be strong references**. 
 This behavior is also consistent with the terminology in other popular languages such as JavaScript and Java, where `WeakMap` only has weak keys. 

 By default **it would use equality semantics, just like a regular `Hash`**. Having an option to compare by identity instead 
 could be useful, but is not strictly required. 
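
To illustrate the difference between the two comparison modes, here is a small sketch using a plain `Hash`, which already supports both. It only demonstrates lookup semantics; nothing in it is weak, and the variable names are made up for the example.

```ruby
# Two distinct String objects with the same content.
a = "config".dup
b = "config".dup

by_equality = {}
by_equality[a] = :found

by_identity = {}.compare_by_identity
by_identity[a] = :found

by_equality[b]  # => :found (keys match via #hash and #eql?)
by_identity[b]  # => nil    (b is not the same object as a)
by_identity[a]  # => :found
```

With equality semantics, any equal key finds the entry, which is what the deduplication use cases rely on.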

 #### Immediate objects support 

 Many WeakMap implementations in other languages don't accept "immediate" objects (or their equivalent) in the weak slots. 
 This is because immediates are never collected, so a weak reference to one can never expire. 

 `ObjectSpace::WeakMap` currently accepts them, and since both its keys and values are weak references, there are legitimate uses for that. 

 However in a `WeakKeysMap`, using an immediate object as key should likely raise a `TypeError`. 

 What is less clear is whether `BIGNUM` (allocated `Integer`) and dynamic symbols (allocated `Symbol`) should be accepted. 

 I believe they shouldn't for consistency, so: 

   - No immediate objects 
   - No `Integer` 
   - No `Symbol` 
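
As a rough illustration of the rule above, the check could look like the following in pure Ruby. `check_weak_key!` and `REJECTED_WEAK_KEY_CLASSES` are hypothetical names invented for this sketch, and `Float` is left out because the proposal does not address it.

```ruby
# Hypothetical helper, not part of any existing API.
# Rejects nil/true/false plus every Integer and Symbol, allocated or not.
REJECTED_WEAK_KEY_CLASSES = [NilClass, TrueClass, FalseClass, Integer, Symbol].freeze

def check_weak_key!(key)
  if REJECTED_WEAK_KEY_CLASSES.any? { |klass| klass === key }
    raise TypeError, "#{key.inspect} (#{key.class}) cannot be used as a weak key"
  end
  key
end

check_weak_key!("a string")  # returns the string
check_weak_key!(Object.new)  # returns the object

begin
  check_weak_key!(2**100)    # allocated Integer, still rejected for consistency
rescue TypeError => e
  e.message
end
```

Checking by class rather than by "is it an immediate" keeps the behavior independent of how a given Integer or Symbol happens to be represented.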

 #### `member` method to lookup an existing key 

 For some use cases, notably deduplication sets, a `member` method that maps to `st_get_key` would be useful. 

 ```ruby 
 def member(key) -> existing key or nil 
 ``` 

 Not sure if `member` is the best possible name, but `#key` is already used for looking up a key by value. 

 That same method could be useful on `Hash` and `Set` as well, but that would be a distinct feature request. 
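
To make the intended semantics concrete, here is a hypothetical pure-Ruby emulation on a plain `Hash`. The free-standing `member(hash, key)` function is an assumption made for this sketch; it scans the keys, whereas a native version would presumably reuse the hash lookup directly.

```ruby
# Hypothetical emulation of the proposed lookup on a plain Hash.
# Returns the key object already stored in the hash, or nil.
# This scans the keys, so it is O(n); it only illustrates the semantics.
def member(hash, key)
  return nil unless hash.key?(key)

  hash.each_key.find { |existing| existing.eql?(key) }
end

first  = [1, 2, 3]
second = [1, 2, 3]  # equal to `first`, but a different object

index = { first => true }

member(index, second).equal?(first)  # => true, the stored key is returned
member(index, [4, 5])                # => nil, no equal key is stored
```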

 #### Naming 

 Possible names: 

   - `::WeakMap` 
   - `::WeakKeysMap` 
   - `::WeakRef::Map` 
   - `::WeakRef::WeakMap` 
   - `::WeakRef::WeakKeysMap` 
   - `::WeakRef::WeakHash` 
   - `::WeakRef::WeakKeysHash` 

 My personal, lightly held, preference goes toward `::WeakRef::WeakKeysMap`. 

 ### Use cases 


 

 #### Deduplicating constructors 

 Can be used by large value object classes to automatically re-use existing instances: 

 ```ruby 
 class SomeValueObject 
   CACHE = WeakKeysMap.new 

   def self.build(attributes) 
     CACHE[attributes] ||= new(attributes).freeze # The instance holds the reference to the key 
   end 

   def initialize(attributes) 
     @attributes = attributes 
     # ... compute some expensive or large data 
   end 

   # ... 
 end 
 ``` 

 #### WeakSet 

 A `WeakKeysMap` would be a good enough primitive to implement a `WeakSet` in pure Ruby code, just like `Set` is 
 implemented with a `Hash`. 
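
A minimal sketch of that layering follows. Since the proposed class does not exist yet, `ObjectSpace::WeakMap` is used as a stand-in backing store; it compares keys by identity and its values are weak too, so this only shows the shape of the wrapper, not the proposed semantics. `SketchWeakSet` is a hypothetical name.

```ruby
# Hypothetical sketch of a set layered on top of a weak map.
# The backing map is injectable so another implementation can be swapped in.
class SketchWeakSet
  include Enumerable

  def initialize(map = ObjectSpace::WeakMap.new)
    @map = map
  end

  # Store the object as a key; the value is a placeholder.
  def add(object)
    @map[object] = true
    self
  end
  alias << add

  def include?(object)
    @map.key?(object)
  end

  def each(&block)
    return enum_for(:each) unless block

    @map.each_key(&block)
    self
  end

  def size
    @map.size
  end
end

set = SketchWeakSet.new
tracked = Object.new
set << tracked
set.include?(tracked)     # => true
set.include?(Object.new)  # => false
```

The wrapper only needs `[]=`, `key?`, `each_key` and `size` from the underlying map, which is why a single weak-keyed primitive would be enough.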

 A `WeakSet` is useful for use cases such as tracking visited objects to avoid cycles when walking an object graph, without holding strong references to them. 

 #### Deduplication sets 

 Assuming `WeakKeysMap` has the `#member` method. 

 A small variation on the "deduplicating constructors" use case, better suited for smaller but numerous objects. 
 The difference here is that we first build the object and then look up a pre-existing one. This is the 
 strategy used to [deduplicate Active Record schema metadata](https://github.com/rails/rails/blob/3be590edbedab8ddcacdf72790d50c3cf9354434/activerecord/lib/active_record/connection_adapters/deduplicable.rb#L5). 


 ```ruby 
 class DeduplicationSet 
   def initialize 
     @set = WeakKeysMap.new 
   end 

   def dedup(object) 
     if existing_object = @set.member(object) 
       existing_object 
     else 
       @set[object] = true 
       object 
     end 
   end 
 end 
 ``` 

 #### Third party object extension 

 When you need to record some associated data on third-party objects whose lifetime you don't control, 
 a `WeakKeysMap` can be used: 

 ```ruby 
 METADATA = WeakKeysMap.new 
 def do_something(third_party_object) 
   metadata = (METADATA[third_party_object] ||= Metadata.new) 
   metadata.foo = "bar" 
 end 
 ``` 
