Reputation: 668
I'm trying to return a vector (a 1-D NumPy array) whose elements sum to 1. The key is that it has to sum to exactly 1.0, as it represents a percentage. However, there seem to be a lot of cases where the sum does not equal 1.0 even after I divide each element by the total. In other words, the sum of x does not equal 1.0 even when x = x'/sum(x').
One case where this occurred is the vector below:
x = np.array([0.090179377557090171, 7.4787182000074775e-05, 0.52465058646452456, 1.3594135000013591e-05, 0.38508165466138505])
The sum of this vector, x.sum(), is 1.0000000000000002, whereas the sum of the vector divided by that value is 0.99999999999999978. Renormalizing again just bounces back and forth between those two sums.
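In an interactive session, using the x above (the exact repr may differ by Python/NumPy version):

>>> x.sum()
1.0000000000000002
>>> (x / x.sum()).sum()
0.99999999999999978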
What I did was round the elements of the vector to the 10th decimal place, np.round(x, decimals=10), and then divide by the sum of the rounded vector, which results in a sum of exactly 1.0. This works when I know the size of the numerical error.
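For concreteness, the workaround looks like this (a sketch; the variable names are mine, and decimals=10 was chosen because I knew the error here was smaller than that):

x_rounded = np.round(x, decimals=10)
x_normed = x_rounded / x_rounded.sum()
x_normed.sum()  # exactly 1.0 for this particular vector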
Unfortunately, that would not be the case in usual circumstances.
I'm wondering if there is a way to correct the numerical error given only the vector itself, so that its sum equals exactly 1.0.
Edit: Is floating point math broken? does not answer my question, as it only explains why the difference occurs, not how to resolve it.
Upvotes: 2
Views: 1393
Reputation: 665
A bit of a hacky solution:
x[-1] = 0            # temporarily drop the last element from the total
x[-1] = 1 - x.sum()  # set it to exactly the residual needed to reach 1.0
Essentially, this shoves the numerical errors into the last element of the array. (No rounding beforehand is needed.)
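For example, on the normalized vector from the question, this should give a sum of exactly 1.0 (a quick sketch; exact floating-point behavior can vary slightly across NumPy versions):

import numpy as np

x = np.array([0.090179377557090171, 7.4787182000074775e-05,
              0.52465058646452456, 1.3594135000013591e-05,
              0.38508165466138505])
x = x / x.sum()        # sums to 0.99999999999999978, not 1.0
x[-1] = 0              # drop the last element from the total
x[-1] = 1 - x.sum()    # replace it with exactly the missing residual
print(x.sum() == 1.0)  # expected: True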
Note: A mathematically simpler solution:
x[-1] = 1.0 - x[:-1].sum()
does not work, due to the different behavior of numpy.sum on the whole array versus a slice: the two sums need not add the elements in the same order, so they can round differently. That is why the residual has to be computed from a full-array x.sum(), as above.
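For comparison, continuing from the snippet above:

x[-1] = 1.0 - x[:-1].sum()  # residual computed from a slice sum
print(x.sum() == 1.0)       # not guaranteed True: the slice sum and the
                            # full-array sum may round differently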
Upvotes: 3