Feature #19643
Updated by nekoyama32767 (Jinsong Yu) over 1 year ago
https://github.com/ruby/ruby/pull/7805
# Introduction
In most of case sort_by works on primitive type.
Using `qsort_r` or `qsort_s` with function pointer is much slower than compare data directly.
I implement a intro sort which compare primitive data directly for sort_by,
and check whether if given block produce primitive keys (`Fixnum`, `Float`, mixed `Fixnum` `Float`).
For example `array.sort_by {|x| x.to_i}` , `array.sort_by {|x| x.to_f}` and `[1.0, 2, 3.0, 4, 5.0].sort_by {|x| x.to_f}`
We can even afford a O(n) type check before primitive data sort.
It still go faster and even approximate to Array#sort when only primitive data need to be compared in default order.
Here is benchmark result for different length of array.
### Small array Bench mark
```
prelude: |
fixnum_array = (1..10).map {rand(10000)}
float_array = (1..10).map {rand(10000.0).to_f}
string_array = (1..10).map {"r" * rand(1..10000)}
mix_array = a = (1..10).map {if rand(1..100) <= 50 then rand(1..10000).to_f else rand(1..10000) end}
benchmark:
fixnum_array: fixnum_array.sort_by {|a| a}
float_array: float_array.sort_by {|a| a}
string_lengtxh: string_array.sort_by {|a| a.length}
mix_array: mix_array.sort_by {|a| a}
```
##### Iteration per second (i/s)
| |Direct Compare |Ruby master |
|:---------------|------------------------------------------------:|--------------------------------------------------:|
|fixnum_array | 1.081M| 895.466k|
| | 1.21x| -|
|float_array | 944.780k| 836.161k|
| | 1.13x| -|
|string_lengtxh | 1.070M| 897.508k|
| | 1.19x| -|
|mix_array | 992.567k| 568.079k|
| | 1.75x| -|
### Medium array Bench mark
```
prelude: |
fixnum_array = (1..10000).map {rand(10000)}
float_array = (1..10000).map {rand(10000.0).to_f}
string_array = (1..10000).map {"r" * rand(1..10000)}
mix_array = a = (1..10000).map {if rand(1..100) <= 50 then rand(1..10000).to_f else rand(1..10000) end}
benchmark:
fixnum_array: fixnum_array.sort_by {|a| a}
float_array: float_array.sort_by {|a| a}
string_lengtxh: string_array.sort_by {|a| a.length}
mix_array: mix_array.sort_by {|a| a}
```
##### Iteration per second (i/s)
| |Direct Compare |Ruby master |
|:---------------|------------------------------------------------:|--------------------------------------------------:|
|fixnum_array | 967.592| 491.893|
| | 1.97x| -|
|float_array | 535.514| 372.758|
| | 1.44x| -|
|string_lengtxh | 904.724| 512.948|
| | 1.76x| -|
|mix_array | 367.114| 186.771|
| | 1.97x| -|
### Large array Bench mark
```
prelude: |
fixnum_array = (1..1000000).map {rand(10000)}
float_array = (1..1000000).map {rand(10000.0).to_f}
string_array = (1..1000000).map {"r" * rand(1..10000)}
mix_array = a = (1..1000000).map {if rand(1..100) <= 50 then rand(1..10000).to_f else rand(1..10000) end}
benchmark:
fixnum_array: fixnum_array.sort_by {|a| a}
float_array: float_array.sort_by {|a| a}
string_lengtxh: string_array.sort_by {|a| a.length}
mix_array: mix_array.sort_by {|a| a}
```
##### Iteration per second (i/s)
| |Direct Compare |Ruby master |
|:---------------|------------------------------------------------:|--------------------------------------------------:|
|fixnum_array | 8.362| 5.386|
| | 1.55x| -|
|float_array | 4.251| 3.563|
| | 1.19x| -|
|string_lengtxh | 6.981| 4.654|
| | 1.50x| -|
|mix_array | 2.975| 1.691|
| | 1.76x| -|