Why does this code return two different values doing the same?

Question

#include 
#include 
#include 

int main()
{
    double a;
    double b;
    double q0 = 0.5 * M_PI + 0.5 * -2.1500000405000002;
    double q1 = 0.5 * M_PI + 0.5 * 0.0000000000000000;
    double w0 = 0.5 * M_PI + 0.5 * -43000.0008100000050000;
    double w1 = 0.5 * M_PI + 0.5 * -0.0000000000000000;
    double m = 1;
    double g = 43000000.81;
    double l1 = 0.1;
    double l2 = 0.1;
    double h = 0.0001;

    a = ((-g / l1) * sin(q0) + (sin(q1 - q0) * (cos(q1 - q0) * (w0 * w0 + (g / l1) * cos(q0)) + l2 * (w1 * w1 / l1))) / (m + pow(sin(q1 - q0), 2)));
    a = h * a;

    b = h * ((-g / l1) * sin(q0) + (sin(q1 - q0) * (cos(q1 - q0) * (w0 * w0 + (g / l1) * cos(q0)) + l2 * (w1 * w1 / l1))) / (m + pow(sin(q1 - q0), 2)));

    printf("%.20lf ", a);
    printf("%.20lf", b);
    return 0;
}

I do the same calculations with a and b, just with the difference that I get the value of a in two steps, and the b in one.

My code returns: -629.47620126173774000000 -629.47620126173763000000

What is the reason of the difference between the two last decimals?

aschepler · Accepted Answer

The C Standard (99 and 11) says:

The values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.

So in an expression such as h*(X+Y) as you have in the assignment to b, the implementation is allowed to use greater precision for the intermediate result of X+Y than can possibly be stored in a double, even though the type of the subexpression is still considered to be double. But in a=X+Y; a=h*a;, the first assignment forces the value to be one that actually can be stored in a double, causing a slightly different result.

Another possibility is that the compiler has done "floating point contraction". To quote the C Standard again,

A floating expression may be contracted, that is, evaluated as though it were an atomic operation, thereby omitting rounding errors implied by the source code and the expression evaluation method.

This would most likely happen if the processor has a single instruction that can do a floating point addition and then a multiplication in one step, and the compiler decided to use it.

Assuming one or both of these is the cause, your value b is likely a more accurate representation of the computation you specified (given that all the inputs were restricted to values which can be represented in a double).

The cppreference page about macro FLT_EVAL_METHOD discusses both of these issues in a little more detail. It may be interesting to find out your value of FLT_EVAL_METHOD and play with #pragma STDC FP_CONTRACT OFF.

Why does this code return two different values doing the same?

Answers (2)

Related Questions