Julien-L

Reputation: 5416

C++: How do floating-point arithmetic operations get optimized?

I observed a surprising behavior when testing simple arithmetic operations at limit cases, on an x86 architecture:

#include <iostream>
#include <valarray>
using namespace std;

int main()
{
    const double max = 9.9e307; // Near std::numeric_limits<double>::max()
    const double init[] = { max, max, max };

    const valarray<double> myvalarray(init, 3);
    const double mysum = myvalarray.sum();
    cout << "Sum is " << mysum << endl;             // Sum is 1.#INF
    const double myavg1 = mysum/myvalarray.size();
    cout << "Average (1) is " << myavg1 << endl;    // Average (1) is 1.#INF
    const double myavg2 = myvalarray.sum()/myvalarray.size();
    cout << "Average (2) is " << myavg2 << endl;    // Average (2) is 9.9e+307
}

(Tested with MSVC in release mode, as well as gcc through Codepad.org. MSVC's debug mode sets average (2) to #INF.)

I expected average (2) to be equal to average (1), but it seems the compiler optimized the C++ built-in division and somehow prevented the accumulation from reaching #INF.
In short: the average of big numbers doesn't yield #INF.

I observed the same behavior with an std algorithm on MSVC:

// (same init[] as above; accumulate needs #include <numeric>)
const double mysum = accumulate(init, init+3, 0.);
cout << "Sum is " << mysum << endl;             // Sum is 1.#INF
const double myavg1 = mysum/static_cast<size_t>(3);
cout << "Average (1) is " << myavg1 << endl;    // Average (1) is 1.#INF
const double myavg2 = accumulate(init, init+3, 0.)/static_cast<size_t>(3);
cout << "Average (2) is " << myavg2 << endl;    // Average (2) is 9.9e+307

(This time however, gcc set average (2) to #INF: http://codepad.org/C5CTEYHj.)

  1. Would someone care to explain how this "effect" was achieved?
  2. Is that a "feature"? Or can I consider this an "unexpected behavior" instead of simply "surprising"?

Thanks

Upvotes: 10

Views: 1357

Answers (4)

James Kanze

Reputation: 153899

It's sort of a feature, or at least, it's intentional. Basically, the floating-point registers on an x86 have more precision and range than a double (a 15-bit exponent rather than 11, and a 64-bit mantissa rather than 52). The C++ standard allows using higher precision and range for intermediate values, and just about any compiler for Intel will do so in some circumstances; the difference in performance is significant. Whether you get the extended precision or not depends on when and whether the compiler spills to memory. (Saving a result in a named variable requires the compiler to convert it to actual double precision, at least according to the standard.) The worst case of this I've seen was some code which basically did:

return p1->average() < p2->average();

with average() doing what you'd expect on an internal table in the data. In some cases, p1 and p2 would in fact point to the same element, but the return value would still be true; the result of one of the function calls had been spilled to memory (and truncated to double), while the result of the other remained in a floating-point register.

(The function was being used as an ordering function for sort, and the resulting code crashed: because of this effect it didn't define a sufficiently strict ordering criterion, and the sort code went outside the range passed to it.)
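A minimal sketch of one common workaround (the Data type and all names here are illustrative, not from the original code): forcing both results into named volatile doubles makes the compiler store and round them identically before the comparison, so an element never compares less than itself.

#include <algorithm>
#include <vector>

struct Data {                       // illustrative stand-in for the real class
    double table[3];
    double average() const {
        return (table[0] + table[1] + table[2]) / 3.0;
    }
};

bool lessAverage(const Data* p1, const Data* p2) {
    // volatile forces each result out of the x87 register into a 64-bit
    // double in memory; both sides are then rounded identically, so
    // lessAverage(p, p) is always false -- a valid ordering for sort.
    const volatile double a1 = p1->average();
    const volatile double a2 = p2->average();
    return a1 < a2;
}

int main() {
    Data d = { { 9.9e307, 9.9e307, 9.9e307 } };
    std::vector<const Data*> v(3, &d);
    std::sort(v.begin(), v.end(), lessAverage);  // safe even with duplicates
}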

Upvotes: 2

sehe

Reputation: 392833

g++ -O0 -g -S test.cpp  -o test.s0
g++ -O3 -g -S test.cpp  -o test.s3

comparing test.s[03] shows that indeed valarray::sum isn't even called again. I haven't looked at it for long, but the following snippets seem to be the defining fragments:

    .loc 3 16 0 ; test.s0

    leal    -72(%ebp), %eax
    movl    %eax, (%esp)
    call    __ZNKSt8valarrayIdE3sumEv
    fstpl   -96(%ebp)           ; spill sum() result to memory, rounding it to a 64-bit double
    leal    -72(%ebp), %eax
    movl    %eax, (%esp)
    call    __ZNKSt8valarrayIdE4sizeEv
    movl    $0, %edx
    pushl   %edx
    pushl   %eax
    fildq   (%esp)
    leal    8(%esp), %esp
    fdivrl  -96(%ebp)           ; divide the rounded sum (from memory) by the size
    fstpl   -24(%ebp)

    .loc 3 17 0

versus

    .loc 1 16 0 ; test.s3
    faddl   16(%eax)            ; accumulate the last element in the 80-bit x87 register
    fdivs   LC3                 ; divide by the constant 3.0 -- the sum never left the register
    fstpl   -336(%ebp)
LVL6:
LBB449:
LBB450:
    .loc 4 514 0

Upvotes: 1

AProgrammer

Reputation: 52274

There are situations where the compiler can use a wider type than the one implied by the declared type, but AFAIK this isn't one of them.

I thus think we have an effect similar to that of GCC bug 323, where additional precision is used when it shouldn't be.

x86 has 80-bit internal FP registers. While gcc tends to use them with maximal precision (hence bug 323), my understanding is that MSVC sets the precision to 53 bits, that of a 64-bit double. But the lengthened significand isn't the only difference of 80-bit FP; the exponent range is also increased, and IIRC there is no setting on x86 forcing the use of the 64-bit double's exponent range.
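A sketch consistent with that point (x86/x87 builds only: _controlfp_s and _PC_53 are MSVC's real precision-control API, but _MCW_PC is rejected on x64, and whether the sum actually stays in a register is up to the code generator): even with the significand clamped to 53 bits, the register keeps its 15-bit exponent, so the intermediate sum need not overflow.

#include <float.h>
#include <iostream>

int main() {
    // Clamp the x87 significand to 53 bits, as MSVC does by default.
    // Precision control narrows only the significand; the register keeps
    // its 15-bit exponent, so 3 * 9.9e307 still fits while in-register.
    unsigned int prev;
    _controlfp_s(&prev, _PC_53, _MCW_PC);

    volatile double in = 9.9e307;   // volatile defeats constant folding
    const double x = in;
    std::cout << (x + x + x) / 3 << '\n';  // may print 9.9e+307, not inf
}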

codepad seems inaccessible now, or I'd test your code on an architecture without 80-bit long doubles.

Upvotes: 1

Christopher Creutzig

Reputation: 8774

Just a guess, but: it may be that Average (2) is computed directly in the floating-point registers, which have a width of 80 bits and overflow later than the 64-bit storage for doubles in memory. You should check the disassembly of your code to see if that is indeed the case.
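A quick complementary check that doesn't require reading assembly (a sketch: volatile is used only to force intermediates out to 64-bit doubles in memory, and the second output only behaves as described under x87 code generation, e.g. a 32-bit build or GCC's -mfpmath=387):

#include <iostream>

int main() {
    volatile double in = 9.9e307;   // volatile defeats constant folding
    const double x = in;

    // Spilled: each assignment rounds the running sum to a 64-bit double,
    // so it overflows to infinity before the division.
    volatile double sum = x;
    sum = sum + x;
    sum = sum + x;
    std::cout << "spilled:  " << sum / 3 << '\n';          // inf

    // Unspilled: the whole expression can stay in an 80-bit register,
    // whose exponent range easily holds 3 * 9.9e307.
    std::cout << "register: " << (x + x + x) / 3 << '\n';  // possibly 9.9e+307
}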

Upvotes: 4
