Feature #17109
closedEliminate the performance impact of operands' order in array's | and &
Description
When I do intersection (a & b) or union (a | b), usually the array in one position is more likely longer than the one in the other position. I always try to remember in what order I should write a and b for the best performance. The right answers for current implementation are longer & shorter and longer | shorter.
Here I suggest to make it simpler and eliminate the difference.
Sorry I don't know C. Though I suppose my solution might be too blunt and verbose, I hope it demonstrates the idea.
make benchmark ITEM=array_union COMPARE_RUBY=~/.rbenv/versions/2.8.0-dev/bin/ruby
|                          |compare-ruby|built-ruby|
|:-------------------------|-----------:|---------:|
|small-&                   |      4.315M|    4.228M|
|                          |       1.02x|         -|
|small-intersection        |      4.157M|    4.106M|
|                          |       1.01x|         -|
|big-&                     |    107.258k|  132.051k|
|                          |           -|     1.23x|
|big-intersection          |    103.245k|  128.052k|
|                          |           -|     1.24x|
|big-reverse-intersection  |     96.544k|  159.201k|
|                          |           -|     1.65x|
My own test shows 20-30% perf improvement of Array#union.
        
           Updated by andrey.samsonov@gmail.com (Andrey Samsonov) about 5 years ago
          Updated by andrey.samsonov@gmail.com (Andrey Samsonov) about 5 years ago
          
          
        
        
      
      - Subject changed from Eliminate the performance impact of operands in array's | and & to Eliminate the performance impact of operands order in array's | and &
        
           Updated by sawa (Tsuyoshi Sawada) about 5 years ago
          Updated by sawa (Tsuyoshi Sawada) about 5 years ago
          
          
        
        
      
      a & b returns different result from b & a. a | b returns different result from b | a. You cannot substitute one with another.
Furthermore, your idea is implementation-dependent.
It does not make sense at all.
        
           Updated by sawa (Tsuyoshi Sawada) about 5 years ago
          Updated by sawa (Tsuyoshi Sawada) about 5 years ago
          
          
        
        
      
      - Tracker changed from Bug to Feature
- Subject changed from Eliminate the performance impact of operands order in array's | and & to Eliminate the performance impact of operands' order in array's | and &
- Description updated (diff)
- ruby -v deleted (2.8)
- Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN)
        
           Updated by greggzst (Grzegorz Jakubiak) about 5 years ago
          Updated by greggzst (Grzegorz Jakubiak) about 5 years ago
          
          
        
        
      
      I mean these operations essentially perform Set union and intersection, am I correct? Then the order of the operands doesn't affect the outcome.
        
           Updated by mame (Yusuke Endoh) about 5 years ago
          Updated by mame (Yusuke Endoh) about 5 years ago
          
          
        
        
      
      I like the idea very much, but very unfortunately, the document of Array#& says:
The order is preserved from the original array.
https://docs.ruby-lang.org/en/2.7.0/Array.html#method-i-26
Personally I agree with you that this is a set operation so the guarantee of the order looks strange, but anyway, this optimization brings a change of the spec.
        
           Updated by andrey.samsonov@gmail.com (Andrey Samsonov) about 5 years ago
          Updated by andrey.samsonov@gmail.com (Andrey Samsonov) about 5 years ago
          
          
        
        
      
      The order is preserved from the original array.
My bad, I missed the comment. In this case, I think the optimisation is not worth a change in the spec.
        
           Updated by Eregon (Benoit Daloze) about 5 years ago
          Updated by Eregon (Benoit Daloze) about 5 years ago
          
          
        
        
      
      set.rb has this optimization for intersection and IMHO Set should be used explicitly if you expect high-performance Set operations.
        
           Updated by Eregon (Benoit Daloze) about 5 years ago
          Updated by Eregon (Benoit Daloze) about 5 years ago
          
          
        
        
      
      - Status changed from Open to Rejected