Reputation: 163
I want to calculate the number of mantissa bits in float and double. I know those numbers should be 23 and 52, but I have to compute them in my program.
Upvotes: 0
Views: 2054
Reputation: 144949
There is an ambiguity in the number of mantissa bits: it could be the true number of bits of precision, or the number of bits actually stored in the representation.
Typically, the mantissa as stored in the IEEE floating point format does not include the initial 1 that is implied for all regular non-zero numbers. Therefore the number of bits in the representation is one less than the true number of bits.
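You can compute this number for binary floating point formats in different ways:
- Some systems may define the nonstandard constants FLT_MANT_BITS, DBL_MANT_BITS and LDBL_MANT_BITS; the standard equivalents in <float.h> are FLT_MANT_DIG, DBL_MANT_DIG and LDBL_MANT_DIG. The value is the true number of mantissa bits.
- You can derive it from FLT_EPSILON, defined in <float.h>: FLT_EPSILON is the difference between 1.0f and the next larger representable float, which is 2^(1-p) for a format with p mantissa bits. The true number of mantissa bits is therefore 1 - log(FLT_EPSILON) / log(2). The same formula can be used for the other floating point formats with DBL_EPSILON and LDBL_EPSILON.
- You can compute it with a loop that keeps halving a value until adding it to 1 no longer changes the result.
Here is a test utility: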
#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void) {
    int n;

    /* Halve f until adding it to 1 no longer changes the result;
       n then counts the true number of mantissa bits. */
    float f = 1.0f;
    for (n = 0; 1.0f + f != 1.0f; n++) {
        f /= 2;
    }
    /* FLT_MANT_BITS is nonstandard; this line is skipped where it is undefined. */
#ifdef FLT_MANT_BITS
    printf("#define FLT_MANT_BITS %d\n", FLT_MANT_BITS);
#endif
#ifdef FLT_EPSILON
    printf("1 - log(FLT_EPSILON)/log(2) = %g\n", 1 - log(FLT_EPSILON) / log(2));
#endif
    printf("Mantissa bits for float: %d\n", n);

    /* Same computation for double. */
    double d = 1.0;
    for (n = 0; 1.0 + d != 1.0; n++) {
        d /= 2;
    }
#ifdef DBL_MANT_BITS
    printf("#define DBL_MANT_BITS %d\n", DBL_MANT_BITS);
#endif
#ifdef DBL_EPSILON
    printf("1 - log(DBL_EPSILON)/log(2) = %g\n", 1 - log(DBL_EPSILON) / log(2));
#endif
    printf("Mantissa bits for double: %d\n", n);

    /* Same computation for long double, using long double constants. */
    long double ld = 1.0L;
    for (n = 0; 1.0L + ld != 1.0L; n++) {
        ld /= 2;
    }
#ifdef LDBL_MANT_BITS
    printf("#define LDBL_MANT_BITS %d\n", LDBL_MANT_BITS);
#endif
#ifdef LDBL_EPSILON
    printf("1 - log(LDBL_EPSILON)/log(2) = %g\n", 1 - log(LDBL_EPSILON) / log(2));
#endif
    printf("Mantissa bits for long double: %d\n", n);
    return 0;
}
Output on my laptop:
1 - log(FLT_EPSILON)/log(2) = 24
Mantissa bits for float: 24
1 - log(DBL_EPSILON)/log(2) = 53
Mantissa bits for double: 53
1 - log(LDBL_EPSILON)/log(2) = 64
Mantissa bits for long double: 64
Upvotes: 3
Reputation: 14705
There are constants defined in the header <cfloat> (<float.h> in C) that you can use.
See FLT_MANT_DIG, for example.
Upvotes: 5