Reputation: 11
I was trying to demonstrate floating-point error in Python. For the following code:
import numpy as np
N = 100
a = b = np.float64(0.1)
for _ in range(N):
    c = 3*a - 2*b
    b = a
    a = c
print(a)
The floating-point error accumulates in every iteration, with the final output being:
35184372088832.07
But with 32-bit floats, the error does not accumulate. Initializing a and b as:
a = b = np.float32(0.1)
leads to the following output:
0.10000000149011612
a and b remain constant in every iteration. Because of type conversion, in the latter case a, b, and c all get converted to float64 by the third iteration anyway, so why is there no error accumulation? And what would the equivalent code be in C?
I tried to replicate the code in C, but I was not able to reproduce the case where a and b remain constant.
Upvotes: 1
Views: 551
Reputation: 280563
Your recurrence is designed to take a small rounding error in the initial 3*a - 2*b computation and amplify it. However, when you initially store a and b as float32, the only rounding error happens when rounding 0.1 to float32. There is no rounding error in the 3*a - 2*b computation.
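To make "amplify" concrete (a standard linear-recurrence argument, not spelled out above): with x_0 = x_1 = 0.1, the recurrence x_n = 3*x_{n-1} - 2*x_{n-2} has general solution x_n = A + B*2^n, and the exact starting values force B = 0, so the exact sequence is constant. A rounding error e introduced in the first step effectively sets B = e, and that component doubles every iteration. In the float64 run above, e works out to about 2^-55, and 2^-55 * 2^100 = 2^45 ≈ 3.5e13, which matches the printed result.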
Performing the 3*a - 2*b computation exactly only requires slightly more precision than a and b were initially rounded to. a and b were initially rounded to float32, but the computation is performed in float64 math, and float64 has way more precision than float32. There is no rounding error in the initial 3*a - 2*b computation, and thus nothing for later iterations to amplify. Every iteration preserves the original values.
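Translated to C, that means rounding the starting value through float once and then running the recurrence in double. A minimal sketch (the variable names and the 100-iteration count mirror your Python code; the printf formatting is my own choice):

#include <stdio.h>

int main(void) {
    /* Round 0.1 once, to float precision: the only rounding error. */
    float start = 0.1f;
    /* Carry the values as doubles so the arithmetic is done in double math. */
    double a = start, b = start, c;
    for (int i = 0; i < 100; i++) {
        c = 3*a - 2*b;  /* exact: a and b carry only float precision, double has room to spare */
        b = a;
        a = c;
    }
    printf("%.17g\n", a);  /* stays at 0.10000000149011612, the float rounding of 0.1 */
    return 0;
}

The likely reason the constant case was hard to reproduce in C is that keeping the variables as float makes the arithmetic itself float, so 3*a picks up a fresh rounding error on every pass, exactly as in your float64 Python run.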
You would see a similar effect if you initially rounded a and b to float64, then performed the subsequent arithmetic in numpy.longdouble math.
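A C analog of that experiment, assuming a platform where long double is genuinely wider than double (true for the 80-bit extended type used by gcc/clang on x86; on MSVC long double is the same as double, so the error would grow again):

#include <stdio.h>

int main(void) {
    /* Round 0.1 once, to double precision. */
    double start = 0.1;
    /* Run the recurrence in long double math. */
    long double a = start, b = start, c;
    for (int i = 0; i < 100; i++) {
        c = 3*a - 2*b;  /* exact when long double has enough extra mantissa bits */
        b = a;
        a = c;
    }
    printf("%.17Lg\n", a);  /* stays at the double rounding of 0.1 */
    return 0;
}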
Upvotes: 2