Project

General

Profile

Actions

Feature #20692

open

Rewrite Array#bsearch in Ruby

Added by sebyx07 (Sebastian Buza) 2 months ago. Updated 2 months ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:118930]

Description

inspired by https://bugs.ruby-lang.org/issues/20182

Proporsal

Rewrite Array#bsearch

class Array
  def bsearch
    return to_enum(:bsearch) { size } unless block_given?

    return nil if empty?

    low = 0
    high = size - 1
    mid = size / 2

    case yield(self[low])
    when true, false
      while low < high
        if yield(self[mid])
          high = mid
        else
          low = mid + 1
        end
        mid = low + (high - low) / 2
      end
      yield(self[low]) ? self[low] : nil
    else
      # Find-any mode
      while low <= high
        case yield(self[mid])
        when 0
          return self[mid]
        when 1
          high = mid - 1
        when -1
          low = mid + 1
        else
          raise TypeError, 'wrong argument type (must be -1, 0, 1, true, or false)'
        end
        mid = low + (high - low) / 2
      end
      nil
    end
  end
end

https://github.com/sebyx07/native_ruby/blob/master/lib/native_ruby/array/bsearch.rb

I also created this gem:
https://github.com/sebyx07/native_ruby/ - to load patches w/o depending on the ruby/master

Benchmark

ruby 3.3.3 (2024-06-12 revision f1c7b6f435) +YJIT [x86_64-linux]

Benchmark results sorted array (average over 10000000 iterations):
                           user     system      total        real
Original bsearch:     11.499511   0.000000  11.499511 ( 11.500056)
Native bsearch:        2.923645   0.003860   2.927505 (  2.927539)


Benchmark results shuffled (average over 10000000 iterations):
                           user     system      total        real
Original bsearch:      8.726749   0.000001   8.726750 (  8.727679)
Native bsearch:        3.073027   0.000006   3.073033 (  3.073231)
Actions #1

Updated by sebyx07 (Sebastian Buza) 2 months ago

  • Description updated (diff)
Actions #2

Updated by sebyx07 (Sebastian Buza) 2 months ago

  • Description updated (diff)
Actions #3

Updated by sebyx07 (Sebastian Buza) 2 months ago

  • Description updated (diff)

Updated by Hanmac (Hans Mackowiak) 2 months ago

find_minimum_mode = finder || !finder isn't this always true?

Actions #5

Updated by sebyx07 (Sebastian Buza) 2 months ago

  • Description updated (diff)

Updated by tenderlovemaking (Aaron Patterson) 2 months ago

I think this is a good idea, but I think we need to make a few changes. First, to_enum and block_given? are both method calls. It's possible for people to monkey patch Array and add those methods, but the original implementation of Array#bsearch would ignore such monkey patches. To maintain existing behavior, we must use code that doesn't depend on those methods the same way the Array#each implementation works.

Additionally, I think this proposal has the same problem as the initial version of the Array#each implementation in Ruby, specifically a possible thread safety problem between the < comparison at the [] fetch. I think we can address that by doing the same trick as Array#each.

Updated by mame (Yusuke Endoh) 2 months ago

I tried make test-all TESTS=test/ruby/test_array.rb with the proposed implementation, but several tests failed. They should be solved before discussing performance.

[ 47/447] TestArray#test_bsearch_in_find_any_mode = 0.00 s
  1) Error:
TestArray#test_bsearch_in_find_any_mode:
TypeError: wrong argument type (must be -1, 0, 1, true, or false)
    <internal:array>:245:in 'Array#bsearch'
    /home/mame/work/ruby/test/ruby/test_array.rb:3360:in 'TestArray#test_bsearch_in_find_any_mode'

[174/447] TestArray#test_bsearch_typechecks_return_values = 0.00 s
  2) Failure:
TestArray#test_bsearch_typechecks_return_values [/home/mame/work/ruby/test/ruby/test_array.rb:3337]:
Expected Exception(TypeError) was raised, but the message doesn't match.
Expected /C\u{309a 26a1 26c4 1f300}/ to match "wrong argument type (must be -1, 0, 1, true, or false)".

[198/447] TestArray#test_bsearch_with_no_block = 0.00 s
  3) Failure:
TestArray#test_bsearch_with_no_block [/home/mame/work/ruby/test/ruby/test_array.rb:3345]:
Expected 5 to be nil.

[269/447] TestArraySubclass#test_bsearch_in_find_any_mode = 0.00 s
  4) Error:
TestArraySubclass#test_bsearch_in_find_any_mode:
TypeError: wrong argument type (must be -1, 0, 1, true, or false)
    <internal:array>:245:in 'Array#bsearch'
    /home/mame/work/ruby/test/ruby/test_array.rb:3360:in 'TestArray#test_bsearch_in_find_any_mode'

[396/447] TestArraySubclass#test_bsearch_typechecks_return_values = 0.00 s
  5) Failure:
TestArraySubclass#test_bsearch_typechecks_return_values [/home/mame/work/ruby/test/ruby/test_array.rb:3337]:
Expected Exception(TypeError) was raised, but the message doesn't match.
Expected /C\u{309a 26a1 26c4 1f300}/ to match "wrong argument type (must be -1, 0, 1, true, or false)".

[420/447] TestArraySubclass#test_bsearch_with_no_block = 0.00 s
  6) Failure:
TestArraySubclass#test_bsearch_with_no_block [/home/mame/work/ruby/test/ruby/test_array.rb:3345]:
Expected 5 to be nil.

I am also concerned about the performance when YJIT is not enabled.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0