Reputation: 77
I need to perform a simple multiplication: 400 * 256.3. The result is 102520. Straightforward and simple. But implementing this multiplication in C++ (or C) is a little tricky and confusing to me.
I understand that floating-point numbers are not stored exactly in a computer. I wrote the code below to illustrate the situation. Output is attached too.
So, if I do the multiplication using a float variable, I am subject to rounding error. Using a double variable would have avoided the problem. But let's say I have very limited resources on an embedded system and have to pick the smallest variable type I can: how can I perform the multiplication using float variables without being susceptible to rounding error?
I know the floating-point math done by the computer is not broken at all. But I am curious about best practices for floating-point math. 256.3 is just a value for illustration. I would not know what floating-point value I will get at runtime, but it will certainly be a floating-point value.
#include <stdio.h>

int main()
{
    //perform 400 * 256.3
    //result should be 102520
    float floatResult = 0.00f;
    int intResult = 0;
    double doubleResult = 0.00;
    //float = int * float
    floatResult = 400 * 256.3f;
    printf("400 * 256.3f = (float)->%f\n", floatResult);
    //float = float * float
    floatResult = 400.00f * 256.3f;
    printf("400.00f * 256.3f = (float)->%f\n", floatResult);
    printf("\n");
    //int = int * float (the conversion to int truncates toward zero)
    intResult = 400 * 256.3f;
    printf("400 * 256.3f = (int)->%d\n", intResult);
    //int = float * float;
    intResult = 400.00f * 256.3f;
    printf("400.00f * 256.3f = (int)->%d\n", intResult);
    printf("\n");
    //double = double * double
    doubleResult = 400.00 * 256.3;
    printf("400.00 * 256.3 = (double)->%f\n", doubleResult);
    //int = double * double;
    intResult = 400.00 * 256.3;
    printf("400.00 * 256.3 = (int)->%d\n", intResult);
    printf("\n");
    //double = int * double
    doubleResult = 400 * 256.3;
    printf("400 * 256.3 = (double)->%f\n", doubleResult);
    //int = int * double
    intResult = 400 * 256.3;
    printf("400 * 256.3 = (int)->%d\n", intResult);
    printf("\n");
    //will double give me rounding error?
    if (((400.00 * 256.3) - 102520) != 0) {
        printf("Double give me rounding error!\n");
    }
    //will float give me rounding error?
    if (((400.00f * 256.3f) - 102520) != 0) {
        printf("Float give me rounding error!\n");
    }
    return 0;
}
Upvotes: 6
Views: 17436
Reputation: 153338
A key weakness in how the problem is demonstrated is the conversion to int intResult. The posted problem is about multiplying and comparing, but the code only shows issues surrounding int conversion.
If code needs to convert a FP value to the nearest whole number, use rint(), round(), nearbyint() or lround(), not integer assignment.
Upvotes: 2
Reputation: 47923
First of all, understand that type double has all the same problems as type float. Neither type has infinite precision, so both types are susceptible to precision loss and other problems.
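As a small illustration that double is not immune, the sketch below uses the well-known 0.1 + 0.2 case. The function name double_sum_is_exact is made up for this example; the values are not from the question.

```c
/* Hypothetical helper: returns 1 if 0.1 + 0.2 compares equal to 0.3 in double
   arithmetic, 0 otherwise. None of 0.1, 0.2, 0.3 is exactly representable in
   binary floating point, so the rounded sum differs from the rounded 0.3. */
int double_sum_is_exact(void)
{
    volatile double a = 0.1;   /* volatile: keep the compiler from folding the sum */
    volatile double b = 0.2;
    double sum = a + b;
    return sum == 0.3;
}
```

On IEEE 754 implementations this returns 0, the same kind of surprise the question ran into with float, just at a different magnitude.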
As to what you can do: there are many different problems that come up, depending on what you're doing, and many techniques to overcome them. Many, many words have been written on these techniques; I suggest doing a web search on "avoiding floating point error". But the basic points are:
See also https://www.eskimo.com/~scs/cclass/handouts/sciprog.html.
Upvotes: 5
Reputation: 6436
If you have a fixed number of decimal digits (1 in the case of 256.3) as well as a bounded range of the results, you can use integer multiplication, and adjust for the shift in decimal digits through integer division:
int result = (400 * 2563) / 10;
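The one-liner above could be generalized roughly as follows. This is only a sketch under stated assumptions: values always carry exactly one decimal digit and are stored as tenths, inputs are non-negative, and a 64-bit intermediate is affordable on the target. SCALE and mul_tenths are names invented for this example.

```c
#include <stdint.h>

/* Assumption: one decimal digit, so 256.3 is stored as the integer 2563. */
#define SCALE 10

/* Hypothetical helper: multiply a whole number by a value stored in tenths.
   Returns the product in whole units, rounded to nearest (non-negative inputs only). */
int32_t mul_tenths(int32_t whole, int32_t tenths)
{
    int64_t product = (int64_t)whole * tenths;   /* widen first to avoid overflow */
    return (int32_t)((product + SCALE / 2) / SCALE);
}
```

For example, mul_tenths(400, 2563) gives 102520 with no floating-point arithmetic involved. The catch is that the input has to arrive as an integer count of tenths; converting an arbitrary runtime float to that form reintroduces rounding.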
Rounding errors are inherent to floating-point arithmetic, except for a few cases where all operands can be represented exactly. Whether you choose float or double just influences when the error occurs, not if.
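A quick way to see the "represented exactly" exception is to compare 256.25 (a sum of powers of two, so exact in binary) with 256.3 (not exact). The helper float_product_equals is a made-up name for this sketch.

```c
/* Hypothetical helper: returns 1 if a * b, stored as a float, compares
   equal to expected, 0 otherwise. */
int float_product_equals(float a, float b, float expected)
{
    volatile float product = a * b;   /* volatile: force rounding to float precision */
    return product == expected;
}
```

Here float_product_equals(400.0f, 256.25f, 102500.0f) returns 1 because every operand and the result are exactly representable, while float_product_equals(400.0f, 256.3f, 102520.0f) returns 0 because 256.3f is already only an approximation of 256.3.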
Upvotes: 6