Kingsman54321

Reputation: 17

Why is the result of the C program showing a different result than expected?

When I run this program:

#include <stdio.h>
int main (void)
{
    float x;
    double y;
    x = - 2147483645.0;
    y = -2147483645.0f; 
    printf("%f, %f", x, y);
    return 0;
}

the result is -2147483648.000000, -2147483645.000000

Why is it so?

Upvotes: 0

Views: 145

Answers (2)

fcdt

Reputation: 2493

The value 2147483645.0 is 1.111111111111111111111111111101∙2³⁰ in binary, so it needs a 30-bit mantissa. But the float data type offers only a 23-bit mantissa, while double has 52 bits. The sign is stored separately. (This depends on your platform and your compiler; these values are for standard x86.)

Consider this program:

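#include <inttypes.h>
#include <stdio.h>
#include <string.h>

int main(void) {
  float x = -2147483645.0;
  double y = -2147483645.0;
  uint32_t xbits;  /* assumes a 32-bit float */
  uint64_t ybits;  /* assumes a 64-bit double */

  /* Copy the raw bytes out; casting the pointers would break strict aliasing. */
  memcpy(&xbits, &x, sizeof xbits);
  memcpy(&ybits, &y, sizeof ybits);

  printf("%f  %08" PRIX32 "\n", x, xbits);
  printf("%f  %016" PRIX64 "\n", y, ybits);
  return 0;
}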

I compiled it with gcc 5.4.0 for x86 and get as output:

-2147483648.000000  CF000000
-2147483645.000000  C1DFFFFFFF400000

The internal representation of each number is shown in hexadecimal on the right. Split into fields, the bits are:

float x (32 bits in total):
===========================
Sign:     1
Exponent: 100 1111 0 (bias 127 + 31)
Mantissa: 000 0000 0000 0000 0000 0000

double y (64 bits in total):
============================
Sign:     1
Exponent: 100 0001 1101 (bias 1023 + 30)
Mantissa: 1111 1111 1111 1111 1111 1111 1111 0100 0000 0000 0000 0000 0000

I have grouped the bits here as in the hexadecimal output. The double y stores exactly the binary representation of the number as described above. In contrast, the mantissa of the float x is zero. This is because the excess bits are not simply cut off. Instead, the value is rounded to the nearest representable float. At this magnitude adjacent floats are 128 apart, so the two candidates are 2147483520 and 2147483648, and the second is closer. That's why you got 1.0∙2³¹ = 2147483648 in the output.

You can also try this out on sites like these.

The rounding is done by the compiler at compile time here, because the value being converted is a constant (the preprocessor is not involved). I don't know a way to influence this, but you can control the rounding mode within the program, as mentioned here:

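#include <fenv.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#pragma STDC FENV_ACCESS ON

/* Print a float next to its raw 32-bit pattern (assumes a 32-bit float). */
static void show(const char *label, float x) {
  uint32_t bits;
  memcpy(&bits, &x, sizeof bits);  /* avoids the strict-aliasing problem of a pointer cast */
  printf("%-15s%f %08" PRIX32 "\n", label, x, bits);
}

int main(void) {
  /* volatile keeps the compiler from doing the conversion at compile time */
  volatile double y = -2147483645.0;

  fesetround(FE_TONEAREST);
  show("FE_TONEAREST:", (float)y);

  fesetround(FE_UPWARD);
  show("FE_UPWARD:", (float)y);

  fesetround(FE_DOWNWARD);
  show("FE_DOWNWARD:", (float)y);

  fesetround(FE_TOWARDZERO);
  show("FE_TOWARDZERO:", (float)y);
  return 0;
}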

Compile with the -lm option. This outputs:

FE_TONEAREST:  -2147483648.000000 CF000000
FE_UPWARD:     -2147483520.000000 CEFFFFFF
FE_DOWNWARD:   -2147483648.000000 CF000000
FE_TOWARDZERO: -2147483520.000000 CEFFFFFF

Upvotes: 4

Aplet123

Reputation: 35482

Floating-point types have finite precision, and a larger type has more of it. In this case, a double is large enough to store the number exactly, but a float isn't, so the float holds (and prints) the nearest value it can represent instead. Read more here.

Upvotes: 4
