ead

Reputation: 34337

Why does -fno-signed-zeros alone enable optimization, for which seemingly also -ffinite-math-only is needed (gcc)

Nothing in the man page suggests that -fno-signed-zeros implies -ffinite-math-only:

-fno-signed-zeros

Allow optimizations for floating point arithmetic that ignore the signedness of zero. IEEE arithmetic specifies the behavior of distinct +0.0 and -0.0 values, which then prohibits simplification of expressions such as x+0.0 or 0.0*x (even with -ffinite-math-only). This option implies that the sign of a zero result isn't significant.

The default is -fsigned-zeros.

However, there are observations which could be explained if it were the case. Problems in my code boil down to the following somewhat silly example:

#include <complex>

std::complex<double> mult(std::complex<double> c, double im){
    std::complex<double> jomega(0.0, im);
    return c*jomega;
}

The compiler would be tempted to optimize the multiplication c*jomega to something like c={-omega*c.imag(), omega*c.real()}, where omega stands for the parameter im. However, IEEE 754 compliance forbids this because of at least the following corner cases:

A) signed zeros, e.g. omega=-0.0, c={0.0, -0.0}:

 (c*jomega).real() = 0.0*0.0-(-0.0)*(-0.0) =  0.0
 -c.imag()*omega   = -(-0.0)*(-0.0)        = -0.0  //different!

B) infinities, e.g. omega=0.0, c={inf, 0.0}:

 (c*jomega).real() = inf*0.0-0.0*0.0 =  nan
 -c.imag()*omega   = -(0.0)*(0.0)    = -0.0     //different!

C) nans, e.g. omega=0.0, c={nan, 0.0}:

 (c*jomega).real() = nan*0.0-0.0*0.0 =  nan
 -c.imag()*omega   = -(0.0)*(0.0)    = -0.0    //different!

That means both -ffinite-math-only (for B and C) and -fno-signed-zeros (for A) should be needed to allow the above optimization.
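The three corner cases can be reproduced with plain double arithmetic, independently of how a particular standard library implements std::complex multiplication. The following is only a minimal sketch: the helper names full_real, simplified_real and same_value are made up for illustration, and it assumes IEEE 754 doubles and a build without -ffast-math or any of its sub-flags.

```cpp
#include <cmath>

// Real part of (re + i*im) * (0 + i*omega) by the school formula:
// re*0.0 - im*omega. The re*0.0 term is kept on purpose.
double full_real(double re, double im, double omega) {
    return re * 0.0 - im * omega;
}

// Real part after dropping the re*0.0 term: -im*omega.
double simplified_real(double im, double omega) {
    return -im * omega;
}

// True if x and y are the same value, treating +0.0 and -0.0 as
// different and any two NaNs as equal.
bool same_value(double x, double y) {
    if (std::isnan(x) || std::isnan(y))
        return std::isnan(x) && std::isnan(y);
    return x == y && std::signbit(x) == std::signbit(y);
}
```

For case A, same_value(full_real(0.0, -0.0, -0.0), simplified_real(-0.0, -0.0)) is false because only the sign of the zero differs; for cases B and C, full_real yields nan while simplified_real yields -0.0.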

However, even with only -fno-signed-zeros on, gcc performs the above optimization, if I read the resulting assembly correctly (or see the listings below for the effects):

mult(std::complex<double>, double):
        mulsd   %xmm2, %xmm1
        movapd  %xmm0, %xmm3
        mulsd   %xmm2, %xmm3
        movapd  %xmm1, %xmm0
        movapd  %xmm3, %xmm1
        xorpd   .LC0(%rip), %xmm0
        ret
.LC0:
        .long   0
        .long   -2147483648
        .long   0
        .long   0
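
To read the listing: the two .long values 0 and -2147483648 (0x80000000) at .LC0 form the quadword 0x8000000000000000, so the xorpd toggles only the sign bit of the double in the low half of xmm0. That is the unary minus of -omega*c.imag(); no subtraction of a second product appears anywhere. A small sketch of the same bit trick in portable C++ follows, where flip_sign is a hypothetical helper name and double is assumed to be a 64-bit IEEE 754 value:

```cpp
#include <cstdint>
#include <cstring>

// Negate a double by XOR-ing its IEEE 754 sign bit, mirroring what
// `xorpd .LC0(%rip), %xmm0` does to the low lane of xmm0.
// Assumption: double is a 64-bit IEEE 754 binary64 value.
double flip_sign(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // well-defined type punning
    bits ^= 0x8000000000000000ULL;         // toggle bit 63, the sign bit
    std::memcpy(&x, &bits, sizeof bits);
    return x;
}
```

The XOR flips the sign of zeros, infinities and NaNs as well, so flip_sign(0.0) is -0.0.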

My first thought was that this could be a bug, but all recent gcc versions I have at hand produce the same result, so I'm probably missing something.

Thus my question: why does gcc perform the above optimization with only -fno-signed-zeros on and without -ffinite-math-only?


Listings:

mult.cpp (a separate translation unit, to keep the compiler from precomputing the result at compile time):

#include <complex>

std::complex<double> mult(std::complex<double> c, double im){
       std::complex<double> jomega(0.0, im);
       return c*jomega;
}

main.cpp:

#include <complex>
#include <iostream>
#include <cmath>

std::complex<double> mult(std::complex<double> c, double im);


int main(){
     //(-nan,-nan) expected:
     std::cout<<"case INF: "<<mult(std::complex<double>(INFINITY,0.0), 0.0)<<"\n";

     //(nan,nan) expected:
     std::cout<<"case NAN: "<<mult(std::complex<double>(NAN,0.0),  0.0)<<"\n"; 
}

Compile and run:

>>> g++ main.cpp mult.cpp -O2 -fno-signed-zeros -o mult_test
>>> ./mult_test
case INF: (-0,-nan)   //unexpected!
case NAN: (-0,nan)    //unexpected!

Upvotes: 4

Views: 1460

Answers (1)

ead

Reputation: 34337

It was a misconception on my part that complex multiplication is defined the way it is taught in school.

Basically, the C++ standard does not concern itself with the details of complex multiplication, so the C standard probably has to be consulted. Complex numbers have been part of the C standard only since C99 (Annex G), and even that does not define every result of a complex multiplication uniquely.

The most important definitions are:

  1. a complex number is zero when both parts are zero (0.0 or -0.0).
  2. a complex number is finite when both parts are finite and not nans.
  3. a complex number is infinite when real or imaginary (or both) parts are inf or -inf (even if the other one is nan).

It is not defined what a complex nan is, so if one part is nan, we can consider the complex number to be nan (as long as no part is infinite).

The standard goes on to say that the school multiplication should hold in most cases, but also that

if one operand is an infinity and the other operand is a nonzero finite number or an infinity, then the result of the operator is an infinity;

That means, for example, that (1.0+0j)*(inf+inf*j) should be infinite (inf+inf*j would probably make the most sense), and not nan+nan*j as the usual formula would give.
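
How such a rule can be met is sketched below, modeled on the example implementation that Annex G of C99 gives for multiplication. This is an illustration under stated assumptions, not the actual libgcc or libstdc++ source: the names school_mul, annex_g_mul, box and denan are made up here, and both functions work on the real and imaginary parts directly so that no library multiplication routine is involved.

```cpp
#include <cmath>
#include <complex>
#include <limits>

using cd = std::complex<double>;

// School formula: (a+ib)(c+id) = (ac-bd) + i(ad+bc).
cd school_mul(cd z, cd w) {
    double a = z.real(), b = z.imag(), c = w.real(), d = w.imag();
    return cd(a * c - b * d, a * d + b * c);
}

// Replace x by 1 if it is infinite, else by 0, keeping its sign.
static double box(double x) {
    return std::copysign(std::isinf(x) ? 1.0 : 0.0, x);
}

// Replace a NaN by a zero of the same sign; leave other values alone.
static double denan(double x) {
    return std::isnan(x) ? std::copysign(0.0, x) : x;
}

// School formula plus an Annex-G-style recovery of infinities that the
// plain formula turned into (nan, nan).
cd annex_g_mul(cd z, cd w) {
    double a = z.real(), b = z.imag(), c = w.real(), d = w.imag();
    double ac = a * c, bd = b * d, ad = a * d, bc = b * c;
    double x = ac - bd, y = ad + bc;
    if (std::isnan(x) && std::isnan(y)) {
        bool recalc = false;
        if (std::isinf(a) || std::isinf(b)) {   // z is infinite
            a = box(a); b = box(b);
            c = denan(c); d = denan(d);
            recalc = true;
        }
        if (std::isinf(c) || std::isinf(d)) {   // w is infinite
            c = box(c); d = box(d);
            a = denan(a); b = denan(b);
            recalc = true;
        }
        if (!recalc && (std::isinf(ac) || std::isinf(bd) ||
                        std::isinf(ad) || std::isinf(bc))) {
            // Infinities that came from overflow of finite operands.
            a = denan(a); b = denan(b); c = denan(c); d = denan(d);
            recalc = true;
        }
        if (recalc) {
            const double inf = std::numeric_limits<double>::infinity();
            x = inf * (a * c - b * d);
            y = inf * (a * d + b * c);
        }
    }
    return cd(x, y);
}
```

With this sketch, annex_g_mul applied to (1,0) and (inf,inf) yields (inf,inf), whereas school_mul yields (nan,nan). For the input from the question, (inf,0)*(0,0), both functions return (nan,nan): an infinity times zero is not covered by the quoted rule, so no infinity is recovered.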

There is more on this topic in my following SO-question.

Given that the compiler has some freedom in the results it produces, we can see that the implementation actually used, via __muldc3, differs from the simplified school formula only when signed zeros are taken into account, i.e. (-0,-0) vs. (0,-0) and so on (see the listing of the test program further below).

That means that gcc's behavior is OK, because it only uses freedom that the standard leaves open. One could argue that this is a missed optimization in clang.

NB: There is also a "bug-report": https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84891


#include <complex>
#include <iostream>
#include <cmath>
#include <cfloat>
#include <vector>


int get_type(std::complex<double> c){
  if(std::isinf(c.real()) || std::isinf(c.imag()))
       return 2;
  if(std::isnan(c.real()) || std::isnan(c.imag()))
       return 1;
  return 0;
}

void do_mult(double b, double c, double d){
     std::complex<double> school(-b*d, b*c);
     std::complex<double> f(0.0,b);
     std::complex<double> s(c,d);
     auto cstd=f*s;

     int type1=get_type(school);
     int type2=get_type(cstd);

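     #ifdef INFINITE_MATH
     // Report a mismatch in kind (finite/nan/inf) or, for finite results,
     // a mismatch under the usual ==, which treats 0.0 and -0.0 as equal.
     if(type1!=type2 || (type1==0 &&  (cstd!=school))){
               std::cout<<"(0.0,"<<b<<")*("<<c<<","<<d<<")="<<school<<"vs."<<cstd<<"\n";
     }

     #endif

     #ifdef SIGNED_ZERO_MATH
     // Same, but compare reciprocals so that 0.0 and -0.0 are told apart
     // (1.0/0.0 is inf, 1.0/-0.0 is -inf).
     if(type1!=type2 || (type1==0 &&  (1.0/cstd.real()!=1.0/school.real() || 1.0/cstd.imag()!=1.0/school.imag() ))){
               std::cout<<"(0.0,"<<b<<")*("<<c<<","<<d<<")="<<school<<"vs."<<cstd<<"\n";
     }
     #endif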
}

int main(){
       std::vector<double> numbers{0.0, -0.0, 1.0, INFINITY, -INFINITY, NAN, DBL_MAX, -DBL_MAX};
       for(double b: numbers)
         for(double c: numbers)
           for(double d: numbers)
               do_mult(b,c,d);
}

To build/run use:

g++ main.cpp -o main -std=c++11 -DINFINITE_MATH -DSIGNED_ZERO_MATH && ./main

Upvotes: 4
