Liu Wei

Reputation: 87

Weird LTO behavior with -ffast-math

Summary

I recently ran into a strange interaction between LTO and -ffast-math: my calls to pow (from <cmath>) return different results depending on whether -flto is used.

Environment:

$ g++ --version
g++ (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ll /lib64/libc.so.6
lrwxrwxrwx 1 root root 12 Sep  3  2019 /lib64/libc.so.6 -> libc-2.17.so

$ ll /lib64/libm.so.6
lrwxrwxrwx 1 root root 12 Sep  3  2019 /lib64/libm.so.6 -> libm-2.17.so

$ cat /etc/redhat-release 
CentOS Linux release 7.5.1804 (Core) 

Minimal Example

Code

// fixed.hxx
#include <cstdint>
double Power10f(const int16_t power);

// fixed.cxx
#include "fixed.hxx"
#include <cmath>

double Power10f(const int16_t power)
{
    return pow(10.0, (double) power);
}

// test.cxx
#include <iostream>
#include <cmath>
#include <cstdlib>   // atoi
#include <iomanip>
#include <cstdint>
#include "fixed.hxx"

int main(int argc, char** argv)
{
    if (argc >= 3) {
        int64_t value = (int64_t)atoi(argv[1]);
        int16_t power = (int16_t)atoi(argv[2]);
        double x = Power10f(power);
        std::cout.precision(17);
        std::cout << std::scientific << x << std::endl;
        std::cout << std::scientific << (double)value * x << std::endl;
        return 0;
    }
    return 1;
}

Compile & Run

Compiling fixed.cxx with -ffast-math gives different results depending on whether -flto is also passed. With -flto first, then without:

$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -flto  -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG   -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000000e+20
8.10000000000000000e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
  400930:       0f bf ff                movswl %di,%edi
  400933:       66 0f ef c9             pxor   %xmm1,%xmm1
  400937:       f2 0f 10 05 99 00 00    movsd  0x99(%rip),%xmm0        # 4009d8 <_IO_stdin_used+0x8>
  40093e:       00 
  40093f:       f2 0f 2a cf             cvtsi2sd %edi,%xmm1
  400943:       e9 d8 fd ff ff          jmpq   400720 <__pow_finite@plt>
  400948:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  40094f:       00
...
$ g++ -O3 -DNDEBUG -ffast-math -std=c++17  -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG   -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000786e+20
8.10000000000006396e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
  400930:       0f bf ff                movswl %di,%edi
  400933:       66 0f ef c0             pxor   %xmm0,%xmm0
  400937:       f2 0f 2a c7             cvtsi2sd %edi,%xmm0
  40093b:       f2 0f 59 05 95 00 00    mulsd  0x95(%rip),%xmm0        # 4009d8 <_IO_stdin_used+0x8>
  400942:       00 
  400943:       e9 88 fd ff ff          jmpq   4006d0 <__exp_finite@plt>
  400948:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  40094f:       00
...

Question

Is this expected behavior, or is something wrong with my code or build commands that causes the difference?

Update

The same result can also be observed on other platforms (e.g. Arch Linux with g++ 12.1 and glibc 2.35).

Upvotes: 1

Views: 158

Answers (2)

Sneftel

Reputation: 41474

-ffast-math gives the compiler permission to be inconsistent for whatever reasons it wants. Modifying even notionally unrelated code in the function could easily lead to pow returning different results thanks to different optimization strategies being chosen. And -flto changes quite a bit about how/when optimization is done, so there's a lot of room for that to happen.

If you care about numerical precision, or numeric consistency, or numerics in general, do not use -ffast-math. The transformations it performs are generally available to you as a programmer, and if you do them yourself, you can rely on their consistency.

Upvotes: 0

n. m. could be an AI

Reputation: 119877

man gcc:

To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link. It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time. For example:

              gcc -c -O2 -flto foo.c
              gcc -c -O2 -flto bar.c
              gcc -o myprog -flto -O2 foo.o bar.o

Upvotes: 6
