jnrbsn
jnrbsn

Reputation: 2533

Why can't I reproduce floating-point issues with pow function?

I recently wrote some code similar to this:

// Calculate a ^ b
unsigned int result = 1;
for (unsigned int i = 0; i < b; i++) {
    result *= a;
}

I got the comment that I should have used pow from math.h, and I was all ready to point out the floating-point precision issues because pow returns a double. I even found an existing Stack Overflow question that illustrates the problem. Except when I tried to run the code from that other question, it "worked" just fine (i.e. no off-by-1 issues).

Here's the code I've been testing:

#include <math.h>
#include <stdio.h>

int main()
{
    unsigned int a = 10;
    unsigned int result;
    for (unsigned int b = 1; b <= 9; b++) {
        result = (unsigned int)pow(a, b);
        printf("%u\n", result);
    }
    return 0;
}

And here's how I'm compiling and running it (on Ubuntu 18.04.3 with GCC version 7.4.0):

$ gcc -O3 -std=c99 -Wall -Wextra -Werror -Wpedantic -pedantic-errors -o pow_test pow_test.c -lm
$ ./pow_test
10
100
1000
10000
100000
1000000
10000000
100000000
1000000000

So why does (unsigned int)pow(a, b) work? I'm assuming the compiler is doing some magic to prevent the normal floating-point issues? What is it doing and why? Seems like kind of a bad idea to encourage the ignorance of these issues if not all compilers do this. And if all modern compilers do do this, is this not a problem you have to worry about as much anymore?

I've also seen people suggest something like (unsigned int)(pow(a, b) + 0.1) to avoid off-by-1 issues. What are the advantages/disadvantages of doing that vs. my solution with the loop?

Upvotes: 2

Views: 834

Answers (2)

Renat
Renat

Reputation: 9002

pow has double arguments and return value (cppreference.com). double has 53 bit significand precision (Wikipedia). On the other hand maximum value in the example is 1000000000, which has 30 bits, so perfectly fits without loosing precision into the 53 bits.

Update: played with Godbolt and it turned out that compiler could even optimize out all the computations, as in your example a and all the b values are known at compile time, for example clang v9.0. with -O3 option just prints constants (and gcc and msvc do a call to pow)

Upvotes: 4

Barmar
Barmar

Reputation: 782564

This is implementation-dependent.

The implementation that was being used in the linked question apparently uses transcendental functions to implement pow(), even when the arguments are precise integers. As a result, it gets round-off errors, so the result is sometimes slightly lower than the actual integer result.

The implementation you're using apparently checks whether the arguments have zero fractions. In that case, it uses an algorithm that produces accurate integer results (it can be done by shifting the mantissa and some additional multiplications). As long as the result fits within the precision of double, you get a float that will convert back to the precise integer.

Upvotes: 3

Related Questions