Weird numpy.sum behavior when adding zeros

Question

I understand how mathematically equivalent arithmetic operations can give different results due to numerical errors (e.g. summing floats in different orders).

However, it surprises me that adding zeros to a sum can change the result. I thought that this always holds for floats, no matter what: x + 0. == x.

Here's an example. I expected all the lines to be exactly zero. Can anybody please explain why this happens?

import numpy as np

M = 4  # number of random values
Z = 4  # number of additional zeros
for i in range(20):
    a = np.random.rand(M)   # M random floats in [0, 1)
    b = np.zeros(M + Z)     # the same values, padded with Z trailing zeros
    b[:M] = a
    print(a.sum() - b.sum())

-4.4408920985e-16
0.0
0.0
0.0
4.4408920985e-16
0.0
-4.4408920985e-16
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
2.22044604925e-16
0.0
4.4408920985e-16
4.4408920985e-16
0.0

It seems not to happen for smaller values of M and Z.

I also made sure that a.dtype == b.dtype.

Here is one more example, which also demonstrates that Python's builtin sum behaves as expected:

import numpy as np

a = np.array([0.1,      1.0/3,      1.0/7,      1.0/13, 1.0/23])
b = np.array([0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23])
print(a.sum() - b.sum())  # numpy sum
# => -1.11022302463e-16
print(sum(a) - sum(b))    # builtin sum
# => 0.0

I'm using numpy V1.9.2.

Recommended answer

Short answer: you are seeing the difference between

a + b + c + d

and

(a + b) + (c + d)

which, because of floating-point inaccuracies, are not the same.
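
To make the effect of grouping concrete, here is a small sketch in plain Python. The four values are chosen purely for illustration (they are not from the question) and exaggerate the effect so that it shows up without inspecting the last digits.

```python
# Illustrative values (an assumption, not taken from the question).
a, b, c, d = 1e100, 1.0, -1e100, 1.0

left_to_right = ((a + b) + c) + d   # how a plain loop adds the terms
pairwise = (a + b) + (c + d)        # terms grouped in pairs first

print(left_to_right)  # 1.0
print(pairwise)       # 0.0
```

Neither result equals the exact sum (2.0), but the two groupings lose different information along the way, so they disagree with each other.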

Long answer: Numpy implements pairwise summation as an optimization of both speed (it allows for easier vectorization) and rounding error.
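
The rounding-error claim can be checked with a rough sketch in plain Python. This uses a textbook recursive pairwise sum, not NumPy's actual code; the function names `naive_sum` and `simple_pairwise_sum` are made up for this illustration, and `math.fsum` serves as the correctly rounded reference.

```python
import math

def naive_sum(values):
    """Add the values one after another, left to right."""
    total = 0.0
    for v in values:
        total += v
    return total

def simple_pairwise_sum(values, lo=0, hi=None):
    """Textbook pairwise summation: split in half, sum each half, add."""
    if hi is None:
        hi = len(values)
    if hi - lo == 0:
        return 0.0
    if hi - lo == 1:
        return values[lo]
    mid = (lo + hi) // 2
    return simple_pairwise_sum(values, lo, mid) + simple_pairwise_sum(values, mid, hi)

values = [0.1] * 65536          # 2**16 copies of the same float
reference = math.fsum(values)   # correctly rounded sum, used as ground truth

naive_error = abs(naive_sum(values) - reference)
pairwise_error = abs(simple_pairwise_sum(values) - reference)
print(naive_error, pairwise_error)
```

This input is a best case for pairwise summation: every addition doubles a value, which is exact in binary floating point, so the pairwise error here is zero. In general, pairwise summation slows the growth of the error rather than eliminating it.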

The numpy sum implementation can be found in the NumPy source (function pairwise_sum_@TYPE@). It essentially does the following:

  1. If the length of the array is less than 8, a regular for-loop summation is performed. This is why the strange result is not observed for smaller M and Z in your case (whenever M + Z < 8): the same for-loop summation is used for both arrays.
  2. If the length is between 8 and 128, it accumulates the sums in 8 bins r[0]-r[7], then sums them as ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7])).
  3. Otherwise, it recursively sums two halves of the array.
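
The three steps above can be sketched in plain Python. This is a reading of the description, not NumPy's code: the name `pairwise_sum_sketch` is hypothetical, and the handling of leftover elements and the exact split point are assumptions marked in the comments.

```python
def pairwise_sum_sketch(values):
    """Sum floats following the three steps described above (sketch only)."""
    n = len(values)
    if n < 8:
        # Step 1: plain left-to-right loop.
        total = 0.0
        for v in values:
            total += v
        return total
    if n <= 128:
        # Step 2: accumulate into 8 bins, then combine the bins pairwise.
        r = list(values[:8])
        full = n - (n % 8)            # assumption: only whole groups of 8 go into bins
        for i in range(8, full, 8):
            for k in range(8):
                r[k] += values[i + k]
        total = ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7]))
        for v in values[full:]:       # assumption: leftovers are added one by one
            total += v
        return total
    # Step 3: recurse on two halves (assumption: split at a multiple of 8).
    half = n // 2
    half -= half % 8
    return pairwise_sum_sketch(values[:half]) + pairwise_sum_sketch(values[half:])

a = [0.1, 1.0/3, 1.0/7, 1.0/13, 1.0/23]                 # length 5: step 1
b = [0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23]  # length 8: step 2
print(pairwise_sum_sketch(a) - pairwise_sum_sketch(b))
```

The two lists hold the same non-zero values, but the padded one crosses the length-8 threshold, so its terms are grouped differently before being added.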

Therefore, in the first case you get a.sum() = a[0] + a[1] + a[2] + a[3] and in the second case b.sum() = (a[0] + a[1]) + (a[2] + a[3]), which can lead to a.sum() - b.sum() != 0.
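
As a rough check of this conclusion without NumPy, the sketch below writes out both groupings by hand for random 4-element inputs and counts how often they disagree. The seed and the number of trials are arbitrary choices for this illustration.

```python
import random

random.seed(0)  # arbitrary seed, only for repeatability

trials = 1000
mismatches = 0
largest_gap = 0.0
for _ in range(trials):
    a = [random.random() for _k in range(4)]
    loop_sum = ((a[0] + a[1]) + a[2]) + a[3]   # grouping used for the short array
    pair_sum = (a[0] + a[1]) + (a[2] + a[3])   # grouping used for the zero-padded array
    if loop_sum != pair_sum:
        mismatches += 1
        largest_gap = max(largest_gap, abs(loop_sum - pair_sum))

print(mismatches, "of", trials, "trials differ; largest gap:", largest_gap)
```

Any gaps should be a few units in the last place of the result, the same order of magnitude as the differences printed in the question.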
