lmeninato

Reputation: 462

Why are my list structure math operations off by a couple billionths?

I have a pretty simple function that should return a list of the distances between each element of the input list and the average of that list. The code almost works. Any thoughts as to why the results are slightly off?

def distances_from_average(test_list):
    average = [sum(test_list)/float(len(test_list))]*len(test_list)
    return [x-y for x,y in zip(test_list, average)]

Here are my example results: [-4.200000000000003, 35.8, 2.799999999999997, -23.200000000000003, -11.200000000000003] should equal [4.2, -35.8, -2.8, 23.2, 11.2]

Upvotes: 3

Views: 84

Answers (2)

dawg

Reputation: 104032

Floating point can lead to surprising results if you do not fully consider how floating point numbers are represented in binary and the rounding errors that can result. These rounding errors are exacerbated when summing a series.

Examples:

>>> sum([.1]*10)
0.9999999999999999   # 1.0 expected in decimal
>>> sum([.1]*1000)
99.9999999999986     # 100.0 expected
>>> sum([1, 1e100, 1, -1e100] * 10000)
0.0                  # 20000 expected
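To see where the error in these sums comes from, it can help to look at the value a float actually stores. The sketch below uses the standard decimal module for that; the variable names are only illustrative.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding,
# so it reveals what the literal 0.1 really holds.
stored = Decimal(0.1)

# Decimal(str) parses the decimal text, so this is exactly one tenth.
intended = Decimal("0.1")

print(stored)               # the full expansion of the stored binary value
print(stored == intended)   # False
print(stored - intended)    # the small representation error in a single 0.1
```

Each addition in a sum rounds again on top of this initial error, which is why the discrepancy grows with the length of the series.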

There are several ways to get results that are exact (the decimal module, the fractions module, and so on). Among other techniques, the rounding error in a sum can be greatly reduced by keeping track of the low-order bits that each addition would otherwise discard. The fsum function in Python's math module does this and gives more accurate sums:

>>> import math
>>> math.fsum([.1]*10)
1.0
>>> math.fsum([.1]*1000)
100.0
>>> math.fsum([1, 1e100, 1, -1e100] * 10000)
20000.0

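The fsum function is based on one of Raymond Hettinger's ActiveState recipes. It is not perfect (try math.fsum([1.1,2.2]*1000)...) but it is pretty good. In that example the inputs are already rounded when they are stored as floats, and fsum cannot undo that; it only avoids adding further error during the summation.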
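As a rough sketch of the exact approach applied to the function in the question, the version below does the arithmetic with fractions.Fraction. The name distances_from_average_exact and the sample input are assumptions made for illustration; the question does not show its input list.

```python
from fractions import Fraction

def distances_from_average_exact(values):
    """Return each value minus the mean, using exact rational arithmetic."""
    # Fraction(str(v)) parses the decimal text, so 0.1 becomes exactly 1/10
    # rather than the nearest binary float.
    exact = [Fraction(str(v)) for v in values]
    mean = sum(exact) / len(exact)
    return [x - mean for x in exact]

# Hypothetical input, chosen only for illustration.
result = distances_from_average_exact([10, 50, 17, -9, 3])

# Convert to float once, at the very end, so each value is rounded only once.
as_floats = [float(d) for d in result]
print(as_floats)  # [-4.2, 35.8, 2.8, -23.2, -11.2]
```

The trade-off is speed: Fraction arithmetic is much slower than float arithmetic, so this is mainly worthwhile when exactness matters more than performance.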

Upvotes: 2

rscarson

Reputation: 262

This is due to the way computers represent floating point numbers.

They cannot represent most decimal fractions exactly, so they are not always accurate in the way you expect. For that reason they should not be compared for exact equality or used to represent things like amounts of money.

How are these numbers being used? If you need that kind of accuracy, there may be better ways to work with the values, such as checking that a result falls within a small range instead of checking for equality.
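One way to do a range check like that is math.isclose, assuming Python 3.5 or later. The helper name lists_close and the tolerance values are illustrative choices, and the sample numbers are taken from the output shown in the question.

```python
import math

def lists_close(actual, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if both lists have equal length and every pair is within tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )

computed = [-4.200000000000003, 35.8, 2.799999999999997]
wanted = [-4.2, 35.8, 2.8]

print(computed == wanted)             # False: exact comparison fails
print(lists_close(computed, wanted))  # True: values agree within tolerance
```

The abs_tol argument matters when the expected value is zero or very close to it, because a purely relative tolerance shrinks to nothing there.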

Here is some good reading material on the subject

Upvotes: 7
