Numpy array operation precision with very small values

Question

I'm having trouble with the precision of my array operations. I'm doing a lot of array calculations where some cells of the array have to be left out, either by masking or, in this case, by assigning very small values near np.finfo(float).tiny to the cells to leave out.
But during array operations this causes an error of around 1e-14, which is quite near to machine epsilon. Still, I don't know where the error comes from or how to avoid it. Since I perform these operations several million times, the errors add up to a total error of around 2-3%.
Here is my minimal working example:

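import numpy as np

arr = np.arange(20).astype(float)
arr[0] = 1e-290                      # cell to be "left out"
t1 = np.random.rand(20) * 100
t2 = np.random.rand(20) * 100
a = (arr * (t1 - t2)).sum()          # sum over all cells
b = (arr * (t1 - t2))[1:].sum()      # sum without cell 0
d = (arr * (t1 - t2))[0].sum()       # contribution of cell 0 alone
c = b - a
print(c)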
# Out[99]: 4.5474735088646412e-13

To avoid this problem, I tried to mask arr:

arr_mask = np.ma.masked_where(arr < 1e-200, arr)
a_mask = (arr_mask * (t1 - t2)).sum()
b_mask = (arr_mask * (t1 - t2))[1:].sum()
c_mask = b_mask - a_mask
print(c_mask)
# Out[118]: 4.5474735088646412e-13

Why is the difference c so many orders of magnitude bigger than d, which should be the difference? I guess it is some machine epsilon problem from assigning such a small value to the array in the first place? But np.finfo(float).eps, at 2.2204460492503131e-16, is still roughly 2000 times smaller than c.
How can I avoid this? Setting the elements to zero won't work, since I have lots of divisions. In this case I can't use masking, for several reasons. BUT the positions of the cells that have to be left out NEVER change. So can I somehow assign a "safe" value to these cells to leave them out without altering the result of the total array operations?
Thanks in advance!

Paul Panzer · Accepted Answer

The granularity of a given float type is not fixed but depends on the size of the value you are starting from. I encourage you to play with the numpy.nextafter function:

>>> import numpy as np
>>> a = 1.5
>>> np.nextafter(a, -1)
1.4999999999999998
>>> a - np.nextafter(a, -1)
2.220446049250313e-16
>>> a = 1e20
>>> np.nextafter(a, -1)
9.999999999999998e+19
>>> a - np.nextafter(a, -1)
16384.0

This shows that the smallest positive difference you can obtain by subtracting some floating-point number from a depends on how large a is.
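The same granularity can be read off directly with np.spacing, which returns the distance from a value to its neighbouring representable float. The following is only an illustrative sketch; the sample values are arbitrary choices, except 1.5 and 1e20, which are the ones used above.

```python
import numpy as np

# np.spacing(x) is the gap between x and the adjacent representable float,
# i.e. the local granularity of the float type at x.
for x in (1e-290, 1.0, 1.5, 3000.0, 1e20):
    print(x, np.spacing(x))

# A few of these gaps, kept in variables for inspection.
spacing_at_one = np.spacing(1.0)      # equals np.finfo(float).eps
spacing_at_3000 = np.spacing(3000.0)  # 2**-41 for any value in [2048, 4096)
spacing_at_1e20 = np.spacing(1e20)    # the 16384.0 seen with nextafter
```

Note that the gap near 3000 is 2**-41, about 4.547e-13, which is the value printed for c in the question.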

You should now be able to work out what happens in your example.
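Working it out for the example in the question: the value printed for c, 4.547e-13, is exactly 2**-41, the gap between adjacent doubles in the range [2048, 4096). That suggests the two sums in that run were a few thousand in size and simply landed on neighbouring floats. The tiny cell itself contributes at most about 1e-288, far below that gap, so it cannot be the source. The difference comes from rounding inside the summation, since a and b may add the same 19 significant terms in a different grouping. Below is a minimal sketch of that check; the seeded generator is an assumption added for reproducibility and is not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

arr = np.arange(20).astype(float)
arr[0] = 1e-290
t1 = rng.random(20) * 100
t2 = rng.random(20) * 100

prod = arr * (t1 - t2)
a = prod.sum()      # includes the ~1e-288 element
b = prod[1:].sum()  # excludes it
d = prod[0]         # what the tiny cell actually contributes

# Granularity of doubles at the magnitude of the total.
ulp = np.spacing(abs(a))

print("d       =", d)
print("b - a   =", b - a)
print("spacing =", ulp)
```

Whatever b - a turns out to be for a given draw, it should be zero or on the order of the printed spacing, never on the order of d.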
