Best practices for float multiplication in C++ or C?

I need to perform a simple multiplication: 400 * 256.3. The result should be 102520. Straightforward and simple. But implementing this multiplication in C++ (or C) is a little tricky and confusing to me.

I understand that floating-point numbers are not represented exactly in a computer. I wrote the code below to illustrate the situation. The output is attached too.

So, if I do the multiplication using a float variable, I am subject to rounding error. Using a double variable would have avoided the problem. But let's say I have very limited resources on an embedded system and have to choose the smallest variable type I can: how can I perform the multiplication using float variables without being susceptible to rounding error?

I know that floating-point math as done by computers is not broken. But I am curious about best practices for performing floating-point math. 256.3 is just a value for illustration. I will not know which value I get at runtime, only that it will be a floating-point value.

#include <stdio.h>

int main(void)
{
    //perform 400 * 256.3
    //result should be 102520

    float floatResult = 0.00f;
    int intResult = 0;
    double doubleResult = 0.00;

    //float = int * float
    floatResult = 400 * 256.3f;
    printf("400 * 256.3f = (float)->%f\n", floatResult);

    //float = float * float
    floatResult = 400.00f * 256.3f;
    printf("400.00f * 256.3f = (float)->%f\n", floatResult);

    printf("\n");

    //int = int * float
    intResult = 400 * 256.3f;
    printf("400 * 256.3f = (int)->%d\n", intResult);

    //int = float * float;
    intResult = 400.00f * 256.3f;
    printf("400.00f * 256.3f = (int)->%d\n", intResult);

    printf("\n");

    //double = double * double
    doubleResult = 400.00 * 256.3;
    printf("400.00 * 256.3 = (double)->%f\n", doubleResult);

    //int = double * double;
    intResult = 400.00 * 256.3;
    printf("400.00 * 256.3 = (int)->%d\n", intResult);

    printf("\n");

    //double = int * double
    doubleResult = 400 * 256.3;
    printf("400 * 256.3 = (double)->%f\n", doubleResult);

    //int = int * double
    intResult = 400 * 256.3;
    printf("400 * 256.3 = (int)->%d\n", intResult);

    printf("\n");

    //will double give me rounding error?
    if (((400.00 * 256.3) - 102520) != 0) {
        printf("Double give me rounding error!\n");
    }

    //will float give me rounding error?
    if (((400.00f * 256.3f) - 102520) != 0) {
        printf("Float give me rounding error!\n");
    }

    return 0;
}

Output from the code above

Upvotes: 6

Answers (3)

chux

Reputation: 154582

A key weakness in how the code demonstrates the problem is the conversion to int intResult. The question is about multiplying and comparing, but the code only shows issues surrounding the int conversion.

If code needs to convert a floating-point value to the nearest whole number, use rint(), round(), nearbyint() or lround(), not integer assignment.

Upvotes: 2

Steve Summit

Reputation: 48083

First of all, understand that type double has all the same problems as type float. Neither type has infinite precision, so both types are susceptible to precision loss and other problems.
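A small sketch of this point, assuming IEEE 754 binary64 for double (which the C standard does not strictly require): the classic sum 0.1 + 0.2 does not compare equal to 0.3 even in double. The function name double_sum_is_exact is made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether 0.1 + 0.2 == 0.3 holds in double.
 * The volatile qualifiers force the sum to be stored as a double
 * instead of being folded or kept in a wider register. */
static bool double_sum_is_exact(void)
{
    volatile double a = 0.1;
    volatile double b = 0.2;
    volatile double sum = a + b;
    return sum == 0.3;
}
```

On IEEE 754 platforms this returns false: none of 0.1, 0.2 and 0.3 is exactly representable in binary, and the rounded sum lands on a different double than the rounded 0.3.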

As to what you can do: there are many different problems that come up, depending on what you're doing, and many techniques to overcome them. Many, many words have been written on these techniques; I suggest doing a web search on "avoiding floating point error". But the basic points are:

Know that floating-point results are generally not exact
Don't try to compare floating-point numbers for exact equality
When comparing floating-point numbers for equality, use an appropriate "epsilon" range
After calculation, it is often appropriate to explicitly round the final value to the desired precision (especially when printing it out)
Beware of algorithms which cause the precision loss to increase with each step

Upvotes: 5

TheOperator

Reputation: 6516

If you have a fixed number of decimal digits (1 in the case of 256.3) as well as a bounded range of the results, you can use integer multiplication, and adjust for the shift in decimal digits through integer division:

int result = (400 * 2563) / 10;

Rounding errors are inherent to floating-point arithmetic, except for a few cases where all operands and the result can be represented exactly. Whether you choose float or double only influences when the error occurs, not whether it does.
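The integer approach above can be generalized into a small helper, shown here as a sketch. It assumes the runtime value is already available as an integer count of tenths (2563 for 256.3); the name mul_by_tenths is made up for illustration. The 64-bit intermediate guards against overflow of the product, which may be costly on small targets, so a 32-bit intermediate is an option when the range of the inputs is known to be small enough.

```c
#include <stdint.h>

/* Hypothetical helper: multiply n by a value stored in tenths
 * (e.g. 2563 represents 256.3) and return the whole-number part.
 * Integer division truncates toward zero. */
static int32_t mul_by_tenths(int32_t n, int32_t tenths)
{
    int64_t product = (int64_t)n * tenths;  /* widen before multiplying */
    return (int32_t)(product / 10);
}
```

Here mul_by_tenths(400, 2563) gives 102520 with no floating-point operation involved.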

Upvotes: 6
