Brian Waters

Reputation: 629

Division of two floats giving incorrect answer

I am attempting to divide two floats in C, using the code below:

#include <stdio.h>
#include <math.h>

int main(){
  float fpfd = 122.88e6;
  float flo = 10e10;
  float int_part, frac_part;

  int_part = (int)(flo/fpfd);
  frac_part = (flo/fpfd) - int_part;

  printf("\nInt_Part = %f\n", int_part);
  printf("Frac_Part = %f\n", frac_part);

  return(0);
}

To compile and run this code, I use the commands:

>> gcc test_prog.c -o test_prog -lm
>> ./test_prog

I then get this output:

Int_Part = 813.000000
Frac_Part = 0.802063

Now, this Frac_Part seems to be incorrect. I tried the same calculation first on a calculator and then in Wolfram Alpha, and both give me:

Frac_Part = 0.802083

Notice the number at the fifth decimal place is different.

This may seem insignificant to most, but for the calculations I am doing it is of paramount importance.

Can anyone explain to me why the C code is making this error?

Upvotes: 3

Views: 5216

Answers (3)

Chris Beck

Reputation: 16224

When you get inadequate precision from floating point operations, the most natural first step is to use a floating point type of higher precision, e.g. double instead of float. (As pointed out immediately in the other answers.)

Second, examine the individual floating point operations and consider their precision. The one that stands out to me is the way the value is split into integer and fractional parts, by casting to int and subtracting. This is not ideal, because subtracting the integer part cancels the leading digits of the original value, so the fractional part that remains holds only the low-order bits of the quotient and has far fewer significant digits than the quotient itself.

I would suggest using the modf function from <math.h> to split a floating point number into its integer and fractional parts. http://www.techonthenet.com/c_language/standard_library_functions/math_h/modf.php

(In greater detail: a float stores a 24-bit significand. In f - (int)f the integer part 813 already takes 10 of those bits, so only about 14 bits, roughly four decimal digits, are left to describe the fraction. The subtraction itself does not round anything further; it just exposes how few bits of the fraction were stored in the first place.)

Upvotes: 5

Yu Hao

Reputation: 122493

float has only 6 to 9 significant decimal digits, which is not precise enough for many practical uses. Changing all float variables to double (which provides 15 to 17 significant digits) gives this output:

Int_Part = 813.000000
Frac_Part = 0.802083

Upvotes: 2

ex0ns

Reputation: 1116

float is a single-precision floating point type. You should try double instead; the following code gives me the right result:

#include <stdio.h>
#include <math.h>

int main(){
  double fpfd = 122.88e6;
  double flo = 10e10;
  double int_part, frac_part;

  int_part = (int)(flo/fpfd);
  frac_part = (flo/fpfd) - int_part;

  printf("\nInt_Part = %f\n", int_part);
  printf("Frac_Part = %f\n", frac_part);

  return(0);
}

Why?

As I said, float is a single-precision floating point type and is smaller than double (on most architectures, sizeof(float) < sizeof(double)). By using double instead of float you get more bits to store the mantissa and the exponent of the number (see Wikipedia).

Upvotes: 3
