Reputation: 689
Consider the following piece of C++ Code:
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
    cout.precision(1000000000);
    float a, b, c;
    a = 1;
    b = -1;
    c = pow(2, -50);
    cout << "a = " << a << endl;
    cout << "b = " << b << endl;
    cout << "c = " << c << endl;
    float ab = a + b;
    float bc = b + c;
    float abc = ab + c;
    float bca = bc + a;
    cout << "a + b = " << ab << endl;
    cout << "b + c = " << bc << endl;
    cout << "(a + b) + c = " << abc << endl;
    cout << "(b + c) + a = " << bca << endl;
    return 0;
}
Which yields the output:
a = 1
b = -1
c = 8.8817841970012523233890533447265625e-16
a + b = 0
b + c = -1
(a + b) + c = 8.8817841970012523233890533447265625e-16
(b + c) + a = 0
Why is b + c = -1?
I can't get my head around this effect of the IEEE 754 standard.
To my understanding, the exponent ranges from -126 to 127 (8 bits for the biased exponent, with a bias of 127).
So 2^(-50) is representable without an issue, as are 1 and -1. None of them is a subnormal (denormalized) number, if I understand the standard correctly.
But why does the addition -1 + 2^(-50) result in -1, with the smaller number being neglected?
Thanks in advance for any help!
Upvotes: 0
Views: 1090
Reputation: 1293
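An IEEE 754 single-precision float has 1 sign bit, 8 exponent bits and 23 stored mantissa bits (24 bits of precision, counting the implicit leading 1). When performing addition, the mantissa of the smaller operand is shifted right to align its exponent with that of the larger one, so 2^-50 is a 1 shifted right by 50 bits relative to 1. That puts it far outside the 24-bit mantissa available for the result, so it is rounded away. You can see the cutoff by repeating your experiment with 2^-24 and 2^-25: with the default round-to-nearest mode, -1 + 2^-24 is still representable, while -1 + 2^-25 rounds back to -1.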
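To make the cutoff concrete, here is a minimal sketch. It assumes float is IEEE 754 single precision and that the default round-to-nearest mode is in effect. The helper name survives_addition_to_minus_one is made up for this illustration. It reports whether adding a small value to -1 changes the stored float.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not from the question): returns true if adding
// `tiny` to -1.0f produces a float different from -1.0f.
// Assumes IEEE 754 single precision with round-to-nearest.
bool survives_addition_to_minus_one(float tiny) {
    // volatile forces the sum to be rounded to float storage, so any extra
    // precision used for intermediate results does not hide the effect.
    volatile float sum = -1.0f + tiny;
    return sum != -1.0f;
}

// Number of significand bits in a float, including the implicit leading 1.
constexpr int float_precision_bits = std::numeric_limits<float>::digits;
```

Calling it with std::ldexp(1.0f, -24) should return true, while std::ldexp(1.0f, -25) and std::ldexp(1.0f, -50) should return false, and float_precision_bits should be 24 on such a platform.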
Upvotes: 2
Reputation: 12507
You are using float, which is (at least) single precision. Use double instead.
And -1 + 9e-16 is within roundoff of -1 in single precision.
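As a rough sketch of the difference, the same sum can be computed in both types. The template name minus_one_plus_tiny is invented for this example, and it assumes float and double are the IEEE 754 single and double formats (24 and 53 bits of precision).

```cpp
#include <cmath>

// Hypothetical helper (name invented here): computes -1 + 2^-50 in type T
// and returns the result after it has been rounded to T.
template <typename T>
T minus_one_plus_tiny() {
    // volatile forces the result to be stored (and rounded) as a T.
    volatile T result = T(-1) + std::ldexp(T(1), -50);
    return result;
}
```

With these assumptions, minus_one_plus_tiny<float>() compares equal to -1, while minus_one_plus_tiny<double>() is slightly greater than -1, because 2^-50 still fits within the 53-bit mantissa of a double.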
Upvotes: 0