Feature #20692
openRewrite Array#bsearch in Ruby
Description
inspired by https://bugs.ruby-lang.org/issues/20182
Proporsal¶
Rewrite Array#bsearch
class Array
  def bsearch
    return to_enum(:bsearch) { size } unless block_given?
    return nil if empty?
    low = 0
    high = size - 1
    mid = size / 2
    case yield(self[low])
    when true, false
      while low < high
        if yield(self[mid])
          high = mid
        else
          low = mid + 1
        end
        mid = low + (high - low) / 2
      end
      yield(self[low]) ? self[low] : nil
    else
      # Find-any mode
      while low <= high
        case yield(self[mid])
        when 0
          return self[mid]
        when 1
          high = mid - 1
        when -1
          low = mid + 1
        else
          raise TypeError, 'wrong argument type (must be -1, 0, 1, true, or false)'
        end
        mid = low + (high - low) / 2
      end
      nil
    end
  end
end
https://github.com/sebyx07/native_ruby/blob/master/lib/native_ruby/array/bsearch.rb
I also created this gem:
https://github.com/sebyx07/native_ruby/ - to load patches w/o depending on the ruby/master
Benchmark¶
ruby 3.3.3 (2024-06-12 revision f1c7b6f435) +YJIT [x86_64-linux]
Benchmark results sorted array (average over 10000000 iterations):
                           user     system      total        real
Original bsearch:     11.499511   0.000000  11.499511 ( 11.500056)
Native bsearch:        2.923645   0.003860   2.927505 (  2.927539)
Benchmark results shuffled (average over 10000000 iterations):
                           user     system      total        real
Original bsearch:      8.726749   0.000001   8.726750 (  8.727679)
Native bsearch:        3.073027   0.000006   3.073033 (  3.073231)
        
           Updated by sebyx07 (Sebastian Buza) about 1 year ago
          Updated by sebyx07 (Sebastian Buza) about 1 year ago
          
          
        
        
      
      - Description updated (diff)
        
           Updated by sebyx07 (Sebastian Buza) about 1 year ago
          Updated by sebyx07 (Sebastian Buza) about 1 year ago
          
          
        
        
      
      - Description updated (diff)
        
           Updated by sebyx07 (Sebastian Buza) about 1 year ago
          Updated by sebyx07 (Sebastian Buza) about 1 year ago
          
          
        
        
      
      - Description updated (diff)
        
           Updated by Hanmac (Hans Mackowiak) about 1 year ago
          Updated by Hanmac (Hans Mackowiak) about 1 year ago
          
          
        
        
      
      find_minimum_mode = finder || !finder isn't this always true?
        
           Updated by sebyx07 (Sebastian Buza) about 1 year ago
          Updated by sebyx07 (Sebastian Buza) about 1 year ago
          
          
        
        
      
      - Description updated (diff)
        
           Updated by sebyx07 (Sebastian Buza) about 1 year ago
          Updated by sebyx07 (Sebastian Buza) about 1 year ago
          
          
        
        
      
      ty Hanmac
        
           Updated by tenderlovemaking (Aaron Patterson) about 1 year ago
          Updated by tenderlovemaking (Aaron Patterson) about 1 year ago
          
          
        
        
      
      I think this is a good idea, but I think we need to make a few changes.  First, to_enum and block_given? are both method calls. It's possible for people to monkey patch Array and add those methods, but the original implementation of Array#bsearch would ignore such monkey patches.  To maintain existing behavior, we must use code that doesn't depend on those methods the same way the Array#each implementation works.
Additionally, I think this proposal has the same problem as the initial version of the Array#each implementation in Ruby, specifically a possible thread safety problem between the < comparison at the [] fetch.  I think we can address that by doing the same trick as Array#each.
        
           Updated by mame (Yusuke Endoh) about 1 year ago
          Updated by mame (Yusuke Endoh) about 1 year ago
          
          
        
        
      
      I tried make test-all TESTS=test/ruby/test_array.rb with the proposed implementation, but several tests failed. They should be solved before discussing performance.
[ 47/447] TestArray#test_bsearch_in_find_any_mode = 0.00 s
  1) Error:
TestArray#test_bsearch_in_find_any_mode:
TypeError: wrong argument type (must be -1, 0, 1, true, or false)
    <internal:array>:245:in 'Array#bsearch'
    /home/mame/work/ruby/test/ruby/test_array.rb:3360:in 'TestArray#test_bsearch_in_find_any_mode'
[174/447] TestArray#test_bsearch_typechecks_return_values = 0.00 s
  2) Failure:
TestArray#test_bsearch_typechecks_return_values [/home/mame/work/ruby/test/ruby/test_array.rb:3337]:
Expected Exception(TypeError) was raised, but the message doesn't match.
Expected /C\u{309a 26a1 26c4 1f300}/ to match "wrong argument type (must be -1, 0, 1, true, or false)".
[198/447] TestArray#test_bsearch_with_no_block = 0.00 s
  3) Failure:
TestArray#test_bsearch_with_no_block [/home/mame/work/ruby/test/ruby/test_array.rb:3345]:
Expected 5 to be nil.
[269/447] TestArraySubclass#test_bsearch_in_find_any_mode = 0.00 s
  4) Error:
TestArraySubclass#test_bsearch_in_find_any_mode:
TypeError: wrong argument type (must be -1, 0, 1, true, or false)
    <internal:array>:245:in 'Array#bsearch'
    /home/mame/work/ruby/test/ruby/test_array.rb:3360:in 'TestArray#test_bsearch_in_find_any_mode'
[396/447] TestArraySubclass#test_bsearch_typechecks_return_values = 0.00 s
  5) Failure:
TestArraySubclass#test_bsearch_typechecks_return_values [/home/mame/work/ruby/test/ruby/test_array.rb:3337]:
Expected Exception(TypeError) was raised, but the message doesn't match.
Expected /C\u{309a 26a1 26c4 1f300}/ to match "wrong argument type (must be -1, 0, 1, true, or false)".
[420/447] TestArraySubclass#test_bsearch_with_no_block = 0.00 s
  6) Failure:
TestArraySubclass#test_bsearch_with_no_block [/home/mame/work/ruby/test/ruby/test_array.rb:3345]:
Expected 5 to be nil.
I am also concerned about the performance when YJIT is not enabled.