Reputation: 13
I noticed that the code below, compiled with clang 11.0.3, gives different results depending on whether I use the -O0 or the -O3 flag.
#include <stdio.h>
#include <inttypes.h>
int64_t foo(int64_t a, int32_t b, int32_t c) {
const int32_t p1 = 7654321;
const int32_t p2 = 8765432;
const int64_t p3 = 1234567LL;
const int32_t p4 = 987654;
const int64_t e = a + b * b * p1 + b * p2 + c * c * p3 + c * p4;
return e;
}
int main(void) {
const int64_t a = 1234LL;
int32_t b = 130;
int32_t c = -148;
printf("%lld\n", foo(a, b, c)); // -O0: 28544296190, -O3: 28544296190
b = 167;
c = -93;
printf("%lld\n", foo(a, b, c)); // -O0: 10772740108, -O3: 15067707404
return 0;
}
The first result is the same, but the second one differs. I thought this was caused by implicit type conversion, so I compiled the code to assembly with the -O0 flag to see in what order the computations are performed. Based on that, I added explicit casts and parentheses in function foo:
const int64_t e = (((a + (int64_t)(b * b * p1)) + (int64_t)(b * p2)) + (int64_t)((int64_t)(c * c) * p3)) + (int64_t)(c * p4);
This did not help, though, and I really do not know how to fix it. What should the code look like in order to work properly with -O3 optimizations?
Upvotes: 1
Views: 50
Reputation: 224437
You've got overflow here:
b * b * p1
When b is 167, you first have (int32_t)167 * (int32_t)167 == (int32_t)27889. Then you have (int32_t)27889 * (int32_t)7654321 == 213471358369, which is outside the range of a signed 32-bit integer. Overflow on signed integers invokes undefined behavior, which clang apparently exploited at -O3.
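To check this kind of arithmetic without triggering the overflow itself, one option is to do the multiplication in 64 bits and compare against the 32-bit limits. The following is only a sketch; the helper name mul_overflows_int32 is made up for illustration and is not part of the original code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when x * y does not fit in int32_t.
   Both operands are widened to int64_t first, so the check itself
   cannot overflow. */
static bool mul_overflows_int32(int32_t x, int32_t y) {
    const int64_t wide = (int64_t)x * (int64_t)y;
    return wide > INT32_MAX || wide < INT32_MIN;
}
```

With the values from the question, mul_overflows_int32(167, 167) is false, while mul_overflows_int32(27889, 7654321) is true, which is the step where b * b * p1 goes wrong.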
The casts you added weren't sufficient because each one was applied to the result of the multiplication, after the overflow had already occurred. You need to cast at least the first operand of each multiplication so that the remaining operands are converted to int64_t and the whole product is computed in 64 bits.
const int64_t e = a + (int64_t)b * b * p1 + (int64_t)b * p2 + (int64_t)c * c * p3 + (int64_t)c * p4;
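For completeness, here is how the whole function might look with that line dropped in. This is a sketch: the name foo_fixed is only used here to distinguish it from the original foo, and everything else is taken from the question.

```c
#include <stdint.h>

/* Same formula as foo in the question, but the first operand of every
   product is cast to int64_t, so each multiplication is carried out in
   64 bits and no intermediate result overflows int32_t. */
int64_t foo_fixed(int64_t a, int32_t b, int32_t c) {
    const int32_t p1 = 7654321;
    const int32_t p2 = 8765432;
    const int64_t p3 = 1234567LL;
    const int32_t p4 = 987654;
    const int64_t e = a + (int64_t)b * b * p1 + (int64_t)b * p2
                        + (int64_t)c * c * p3 + (int64_t)c * p4;
    return e;
}
```

Working the arithmetic by hand, foo_fixed(1234, 130, -148) should return 157393315070 and foo_fixed(1234, 167, -93) should return 225521104908, at any optimization level. Note that these differ from both outputs in the question: b * b * p1 also overflows for b = 130, it just happened to produce the same value at -O0 and -O3. Since int64_t is not long long on every platform, printing with PRId64 from <inttypes.h> is the portable choice.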
Upvotes: 1