Reputation: 87
Recently I encountered a weird issue involving LTO and -ffast-math: my pow calls (from cmath) give inconsistent results depending on whether -flto is used.
$ g++ --version
g++ (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ ll /lib64/libc.so.6
lrwxrwxrwx 1 root root 12 Sep 3 2019 /lib64/libc.so.6 -> libc-2.17.so
$ ll /lib64/libm.so.6
lrwxrwxrwx 1 root root 12 Sep 3 2019 /lib64/libm.so.6 -> libm-2.17.so
$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
fixed.hxx
#include <cstdint>
double Power10f(const int16_t power);
fixed.cxx
#include "fixed.hxx"
#include <cmath>
double Power10f(const int16_t power)
{
    return pow(10.0, (double) power);
}
test.cxx
#include <iostream>
#include <cmath>
#include <cstdlib>  // std::atoi
#include <iomanip>
#include <cstdint>
#include "fixed.hxx"
int main(int argc, char** argv)
{
    if (argc >= 3) {
        int64_t value = (int64_t)std::atoi(argv[1]);
        int16_t power = (int16_t)std::atoi(argv[2]);
        double x = Power10f(power);
        std::cout.precision(17);
        std::cout << std::scientific << x << std::endl;
        std::cout << std::scientific << (double)value * x << std::endl;
        return 0;
    }
    return 1;
}
Compiling it with -ffast-math gives different results depending on whether -flto is used.
With -flto, it eventually calls the __pow_finite version and gives an "accurate" result:
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -flto -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000000e+20
8.10000000000000000e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
400930: 0f bf ff movswl %di,%edi
400933: 66 0f ef c9 pxor %xmm1,%xmm1
400937: f2 0f 10 05 99 00 00 movsd 0x99(%rip),%xmm0 # 4009d8 <_IO_stdin_used+0x8>
40093e: 00
40093f: f2 0f 2a cf cvtsi2sd %edi,%xmm1
400943: e9 d8 fd ff ff jmpq 400720 <__pow_finite@plt>
400948: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40094f: 00
...
Without -flto, it eventually calls __exp_finite (an optimization enabled by -ffast-math, if I guess right) and gives an "inaccurate" result:
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000786e+20
8.10000000000006396e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
400930: 0f bf ff movswl %di,%edi
400933: 66 0f ef c0 pxor %xmm0,%xmm0
400937: f2 0f 2a c7 cvtsi2sd %edi,%xmm0
40093b: f2 0f 59 05 95 00 00 mulsd 0x95(%rip),%xmm0 # 4009d8 <_IO_stdin_used+0x8>
400942: 00
400943: e9 88 fd ff ff jmpq 4006d0 <__exp_finite@plt>
400948: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40094f: 00
...
Is this expected behavior, or is something wrong with my code?
The same result can also be observed on other platforms (e.g. Arch Linux with g++ 12.1 and glibc 2.35).
Upvotes: 1
Views: 158
Reputation: 41474
-ffast-math gives the compiler permission to be inconsistent for whatever reasons it wants. Modifying even notionally unrelated code in the function could easily lead to pow returning different results thanks to different optimization strategies being chosen. And -flto changes quite a bit about how/when optimization is done, so there's a lot of room for that to happen.
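The rewrite visible in the question's second disassembly, where the argument is multiplied by a constant and passed to __exp_finite, can be reproduced by hand. The sketch below is only an illustration: it assumes the constant is ln(10), so that pow(10, p) becomes exp(p * ln 10), and the names Power10ViaPow, Power10ViaExp and RelativeDifference are hypothetical, not taken from the question. Rounding the product p * ln 10 before exp sees it is the likely source of the small error in the "inaccurate" output.

```cpp
#include <cmath>
#include <cstdint>

// Direct library call, mirroring the question's Power10f.
double Power10ViaPow(std::int16_t power)
{
    return std::pow(10.0, static_cast<double>(power));
}

// Hand-written form of the rewrite: 10^p == exp(p * ln 10).
// Assumption: the constant loaded in the disassembly is ln(10).
double Power10ViaExp(std::int16_t power)
{
    const double ln10 = std::log(10.0);
    return std::exp(static_cast<double>(power) * ln10);
}

// Relative difference between the two forms for a given exponent.
// How large it is depends on the libm in use.
double RelativeDifference(std::int16_t power)
{
    const double viaPow = Power10ViaPow(power);
    const double viaExp = Power10ViaExp(power);
    return std::fabs(viaPow - viaExp) / std::fabs(viaPow);
}
```

Printing RelativeDifference(20) with 17 significant digits should show whether the two forms disagree on a given platform. The difference, if any, is expected to be tiny, on the order of the 7.86e-15 relative error seen in the question.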
If you care about numerical precision, or numeric consistency, or numerics in general, do not use -ffast-math. The transformations it performs are generally available to you as a programmer, and if you do them yourself, you can rely on their consistency.
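For the question's specific case of an integer exponent, one way to "do it yourself" might be to skip libm altogether. This is a swapped-in technique, not something proposed in the question, and Power10Exact is a hypothetical name. A minimal sketch, assuming exact results are only needed for exponents between -22 and 22, where every power of ten up to 1e22 fits in a double without rounding:

```cpp
#include <cstdint>

// Hypothetical replacement for Power10f that makes no libm call.
// Assumption: exactness matters only for |power| <= 22. Every power of ten
// up to 1e22 is exactly representable as a double, so each multiplication
// below is exact in that range. Beyond it the result is rounded at each step.
double Power10Exact(std::int16_t power)
{
    // Widen to int first so that negating -32768 cannot overflow.
    const int magnitude = power < 0 ? -static_cast<int>(power) : power;

    double result = 1.0;
    for (int i = 0; i < magnitude; ++i) {
        result *= 10.0;
    }

    // Negative exponents cost one division of two exactly known values.
    return power < 0 ? 1.0 / result : result;
}
```

With this version, Power10Exact(20) is exactly 1e20 and 81 * Power10Exact(20) is exactly 8.1e21, since both values are representable. The trade-off is a loop instead of a single call, and results outside the assumed range lose accuracy.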
Upvotes: 0
Reputation: 119877
man gcc:
To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link. It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time. For example:
gcc -c -O2 -flto foo.c
gcc -c -O2 -flto bar.c
gcc -o myprog -flto -O2 foo.o bar.o
Upvotes: 6