Feature #12871
closedUsing the algorithm like math.fsum of Python for Array#sum
Description
Array#sum uses the Kahan's algorithm for Float values now. But it returns inaccurate value in some case like below.
# ruby 2.4.0-preview2
[10000000000.0, 0.0000000001, -10000000000.0].sum #=> 0.0 (expected 0.0000000001)
Python's math.fsum uses another algorithm. It saves all digits, and returns accurate value in such a case.
(See: https://github.com/python/cpython/blob/d267006f18592165ed97e0a9c2494d3bce25fc2b/Modules/mathmodule.c#L1087)
# python 3.5.2
from math import fsum
fsum([10000000000.0, 0.0000000001, -10000000000.0]) #=> 0.0000000001
I propose to use the algorithm like math.fsum of Python for Array#sum.
This is an example implementation in Ruby.
class Array
# This implementation does not consider non float values
def sum
partials = []
each do |x|
i = 0
partials.each do |y|
x, y = y, x if x.abs < y.abs
hi = x + y # upper bits
lo = y - (hi - x) # lower bits (lost)
if lo != 0.0
partials[i] = lo
i += 1
end
x = hi
end
partials[i..-1] = [x]
end
partials.inject(0.0, :+)
end
end
Updated by mrkn (Kenta Murata) almost 8 years ago
- Status changed from Open to Assigned
- Assignee set to mrkn (Kenta Murata)
Updated by t-nissie (Takeshi Nishimatsu) almost 8 years ago
A quick hack.
- Elongation (or reallocation) of the array of partials[] when nn exeeds NUM_PARTIALS.
- Tests.
- Name of this algorithm. Kahan-Babuska-Neumaier?
are required.
diff --git a/array.c b/array.c
index b99ab45..2b818bf 100644
--- a/array.c
+++ b/array.c
@@ -5688,6 +5688,8 @@ rb_ary_dig(int argc, VALUE *argv, VALUE self)
return rb_obj_dig(argc, argv, self, Qnil);
}
+#define NUM_PARTIALS 32 /* initial partials array size, on stack */
+
/*
* call-seq:
* ary.sum(init=0) -> number
@@ -5796,14 +5798,15 @@ rb_ary_sum(int argc, VALUE *argv, VALUE ary)
}
if (RB_FLOAT_TYPE_P(e)) {
- /* Kahan's compensated summation algorithm */
- double f, c;
+ /* ???'s compensated summation algorithm */
+ double f,partials[NUM_PARTIALS];
+ long ii, jj, nn;
- f = NUM2DBL(v);
- c = 0.0;
+ partials[0] = NUM2DBL(v);
+ nn=1;
goto has_float_value;
for (; i < RARRAY_LEN(ary); i++) {
- double x, y, t;
+ double x, y, tmp, hi, lo;
e = RARRAY_AREF(ary, i);
if (block_given)
e = rb_yield(e);
@@ -5819,10 +5822,28 @@ rb_ary_sum(int argc, VALUE *argv, VALUE ary)
else
goto not_float;
- y = x - c;
- t = f + y;
- c = (t - f) - y;
- f = t;
+ ii = 0;
+ for (jj=0; jj < nn; jj++) {
+ y = partials[jj];
+ if (fabs(x) < fabs(y)) {
+ tmp = x;
+ x = y;
+ y =tmp;
+ }
+ hi = x + y; /* upper bits */
+ lo = y - (hi - x); /* lower bits (lost) */
+ if (lo != 0.0) {
+ partials[ii] = lo;
+ ii += 1;
+ }
+ x = hi;
+ }
+ partials[ii] = x;
+ nn = ii+1;
+ }
+ f = 0.0;
+ for (i=0; i<nn; i++) {
+ f += partials[i];
}
return DBL2NUM(f);
Updated by t-nissie (Takeshi Nishimatsu) almost 8 years ago
Julia can do it, too.
julia> sum_kbn([1.0e10, 1.0e-10, -1.0e10])
1.0e-10
The source code is https://github.com/JuliaLang/julia/blob/master/base/reduce.jl .
Updated by mrkn (Kenta Murata) almost 8 years ago
- Status changed from Assigned to Closed
Applied in changeset r57001.
array.c, enum.c: change sum algorithm
-
array.c (rb_ary_sum): change the algorithm to Kahan-Babuska balancing
summation to be more precise.
[Feature #12871] [ruby-core:77771] -
enum.c (sum_iter, enum_sum): ditto.
-
test_array.rb, test_enum.rb: add an assertion for the above change.
Updated by mrkn (Kenta Murata) almost 8 years ago
Takeshi Nishimatsu wrote:
Julia can do it, too.
julia> sum_kbn([1.0e10, 1.0e-10, -1.0e10]) 1.0e-10
The source code is https://github.com/JuliaLang/julia/blob/master/base/reduce.jl .
Thank you for pointing the information.
I referred the paper written by A. Klein [1], and employed the algorithm in that paper.
It is the same algorithm of sum_kbn in Julia.
[1] Klein, A. Computing (2006) 76: 279. http://link.springer.com/article/10.1007/s00607-005-0139-x
Updated by labocho (Keisuke NISHI) almost 8 years ago
Thank you.
Julia's algorithm looks good. But in some case, Python's algorithm is still better than that.
# julia 0.5.0
sum_kbn([1.0e100, 1.0, 1.0e-100, -1.0, -1.0e100]) # => 0.0
# python 3.5.2
from math import fsum
fsum([1.0e100, 1.0, 1.0e-100, -1.0, -1.0e100]) # => 1e-100
```
Is it acceptable?