Reputation: 462
I have a fairly simple function that tries to return a list of the differences between each element of the input list and the average of that list. The code almost works. Any thoughts as to why the results are slightly off?
def distances_from_average(test_list):
    average = [sum(test_list)/float(len(test_list))]*len(test_list)
    return [x-y for x,y in zip(test_list, average)]
Here are my example results: [-4.200000000000003, 35.8, 2.799999999999997, -23.200000000000003, -11.200000000000003] should equal [4.2, -35.8, -2.8, 23.2, 11.2]
Upvotes: 3
Views: 84
Reputation: 104032
Floating point can lead to surprising results if you do not fully consider how floating point numbers are represented in binary and the rounding errors that can result. These rounding errors accumulate when many values are summed in series.
Examples:
>>> sum([.1]*10)
0.9999999999999999 # 1.0 expected in decimal
>>> sum([.1]*1000)
99.9999999999986 # 100.0 expected
>>> sum([1, 1e100, 1, -1e100] * 10000)
0.0 # 20000 expected
There are numerous ways to get results that are exact (using the decimal module, using the fractions module, etc.). Among other techniques, rounding errors can be canceled out during summation by keeping track of them as the sum proceeds. You can use fsum from the Python math library for more accurate summation:
>>> import math
>>> math.fsum([.1]*10)
1.0
>>> math.fsum([.1]*1000)
100.0
>>> math.fsum([1, 1e100, 1, -1e100] * 10000)
20000.0
The fsum function is based on Raymond Hettinger's ActiveState recipes. It is not perfect (try math.fsum([1.1,2.2]*1000)...) but it is pretty good.
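For the exact alternatives mentioned above, here is a minimal sketch using the standard library's fractions and decimal modules. The variable names are made up for illustration, and note that the decimal value is built from the string "0.1" rather than the float 0.1, so the binary approximation never enters the calculation:

```python
from decimal import Decimal
from fractions import Fraction

# Fraction stores an exact numerator/denominator pair, so 1/10 is exact.
tenth_sum_fraction = sum([Fraction(1, 10)] * 10)

# Decimal("0.1") is an exact decimal tenth (constructed from a string,
# not from the float 0.1, which would already carry the binary error).
tenth_sum_decimal = sum([Decimal("0.1")] * 10)

print(tenth_sum_fraction)  # 1
print(tenth_sum_decimal)   # 1.0
```

Both are slower than plain floats, so they tend to make sense only where exactness matters more than speed.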
Upvotes: 2
Reputation: 262
This is due to the way computers represent floating point numbers.
Many decimal values cannot be stored exactly in binary, so floats are not always accurate in the way you expect. They should therefore not be compared for exact equality, or used to represent things like amounts of money.
How are these numbers being used? If you need that kind of accuracy perhaps there are better ways to use the information, like checking for a range instead of checking equality.
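One way to check for a range instead of equality is math.isclose from the standard library (available since Python 3.5). The sketch below is only an illustration: lists_close is a hypothetical helper, the tolerance is an arbitrary choice, and the sample values are just two of the numbers from the question paired with their rounded forms.

```python
import math

def lists_close(actual, expected, abs_tol=1e-9):
    """Hypothetical helper: True if both lists have the same length and
    every pair of elements agrees within the given tolerance."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, abs_tol=abs_tol) for a, e in zip(actual, expected)
    )

# Exact comparison fails because neither side is stored exactly.
print(0.1 + 0.2 == 0.3)              # False

# Tolerance-based comparison treats the two values as equal.
print(math.isclose(0.1 + 0.2, 0.3))  # True

# The same idea applied element by element to a list of results.
print(lists_close([2.799999999999997, -11.200000000000003], [2.8, -11.2]))  # True
```

The abs_tol argument matters mostly when comparing against values at or near zero, where the default relative tolerance alone would never match.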
Here is some good reading material on the subject
Upvotes: 7