Reputation: 34337
There is nothing in the man pages that would suggest that -fno-signed-zeros would imply -ffinite-math-only:
-fno-signed-zeros
    Allow optimizations for floating point arithmetic that ignore the signedness of zero. IEEE arithmetic specifies the behavior of distinct +0.0 and -0.0 values, which then prohibits simplification of expressions such as x+0.0 or 0.0*x (even with -ffinite-math-only). This option implies that the sign of a zero result isn't significant. The default is -fsigned-zeros.
However, there are observations that could be explained if that were the case. The problems in my code boil down to the following somewhat silly example:
#include <complex>
std::complex<double> mult(std::complex<double> c, double im){
std::complex<double> jomega(0.0, im);
return c*jomega;
}
The compiler would be tempted to optimize the multiplication c*jomega to something similar to c={-omega*c.imag(), omega*c.real()}, with omega standing for the imaginary part im.
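Written out as code, the simplified version would look like this hypothetical mult_simplified (my sketch, not the compiler's output):

```cpp
#include <complex>

// Hypothetical simplification: drop the 0.0*... terms of the full
// complex-multiplication formula when multiplying by jomega = (0, im).
std::complex<double> mult_simplified(std::complex<double> c, double im) {
    return {-c.imag() * im, c.real() * im};
}
```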
However, IEEE 754 compliance and at least the following corner cases prevent it:

A) signed zeros, e.g. omega=-0.0, c={0.0, -0.0}:
(c*jomega).real() = 0.0*0.0-(-0.0)*(-0.0) = 0.0
-c.imag()*omega = -(-0.0)*(-0.0) = -0.0 //different!

B) infinities, e.g. omega=0.0, c={inf, 0.0}:
(c*jomega).real() = inf*0.0-0.0*0.0 = nan
-c.imag()*omega = -(0.0)*(0.0) = -0.0 //different!

C) nans, e.g. omega=0.0, c={nan, 0.0}:
(c*jomega).real() = nan*0.0-0.0*0.0 = nan
-c.imag()*omega = -(0.0)*(0.0) = -0.0 //different!
That means we have to use both -ffinite-math-only (for B and C) and -fno-signed-zeros (for A) in order to allow the above optimization.

However, even with only -fno-signed-zeros on, gcc performs the above optimization, if I understand the resulting assembler right (or see the listings below for the effects):
mult(std::complex<double>, double):
mulsd %xmm2, %xmm1          # xmm1 = c.imag()*im
movapd %xmm0, %xmm3
mulsd %xmm2, %xmm3          # xmm3 = c.real()*im
movapd %xmm1, %xmm0
movapd %xmm3, %xmm1         # imag(result) = c.real()*im
xorpd .LC0(%rip), %xmm0     # real(result) = -(c.imag()*im): just flip the sign bit
ret
.LC0:                       # constant with only the sign bit of the low double set
.long 0
.long -2147483648
.long 0
.long 0
My first thought was that this could be a bug - but all recent gcc versions I have at hand produce the same result, so I'm probably missing something.

Thus my question: why does gcc perform the above optimization with only -fno-signed-zeros on and without -ffinite-math-only?
Listings:

A separate mult.cpp, to avoid funky precalculation during compilation:
#include <complex>
std::complex<double> mult(std::complex<double> c, double im){
std::complex<double> jomega(0.0, im);
return c*jomega;
}
main.cpp:
#include <complex>
#include <iostream>
#include <cmath>
std::complex<double> mult(std::complex<double> c, double im);
int main(){
//(-nan,-nan) expected:
std::cout<<"case INF: "<<mult(std::complex<double>(INFINITY,0.0), 0.0)<<"\n";
//(nan,nan) expected:
std::cout<<"case NAN: "<<mult(std::complex<double>(NAN,0.0), 0.0)<<"\n";
}
Compile and run:
>>> g++ main.cpp mult.cpp -O2 -fno-signed-zeros -o mult_test
>>> ./mult_test
case INF: (-0,-nan) //unexpected!
case NAN: (-0,nan) //unexpected!
Upvotes: 4
Views: 1460
Reputation: 34337
It was a misconception on my side that complex-number multiplication is defined the same way it is taught in school.

Basically, the C++ standard isn't concerned with complex multiplication, so probably the C standard has to be consulted. Complex numbers have been part of the C standard only since C99 (Annex G), which still does not define all results of complex multiplication uniquely.
The most important definitions are:

- A complex number is zero if both parts are zeros (0.0 or -0.0).
- A complex number is finite if neither part is one of the infinities or nans.
- A complex number is infinite if at least one part is inf or -inf (even if the other one is nan).
- It is not defined what a complex nan is, so if one part is nan, we can consider the complex number to be nan (as long as there is no infinite part).
The standard goes on to say that the school formula should hold in most cases, but also that

    if one operand is an infinity and the other operand is a nonzero finite number or an infinity, then the result of the operator is an infinity;
That means, for example, that (1.0+0j)*(inf+inf*j) should be infinite (inf+inf*j would probably make the most sense), but not nan+nan*j, as it would be the case with the usual formula.
There is more on this topic in my follow-up SO question.
Given that the compiler has some freedom in producing results, we can see that the difference between the implementation used (via __muldc3) and the simplified school formula shows up only if signed zeros are taken into account, i.e. (-0,-0) vs. (0,-0) and so on (see the listing of the program testing it further below).

That means the behavior of gcc is OK, because it uses behavior left undefined by the standard. One could argue that this is a missed optimization in clang.
NB: There is also a "bug-report": https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84891
#include <complex>
#include <iostream>
#include <cmath>
#include <cfloat>
#include <vector>
//returns 2 for infinite, 1 for nan, 0 for finite (per the definitions above)
int get_type(std::complex<double> c){
if(std::isinf(c.real()) || std::isinf(c.imag()))
return 2;
if(std::isnan(c.real()) || std::isnan(c.imag()))
return 1;
return 0;
}
void do_mult(double b, double c, double d){
std::complex<double> school(-b*d, b*c); //school formula for (0+b*i)*(c+d*i)
std::complex<double> f(0.0,b);
std::complex<double> s(c,d);
auto cstd=f*s; //multiplication as implemented by the standard library
int type1=get_type(school);
int type2=get_type(cstd);
#ifdef INFINITE_MATH
//not special, usual
if(type1!=type2 || (type1==0 && (cstd!=school))){
std::cout<<"(0.0,"<<b<<")*("<<c<<","<<d<<")="<<school<<"vs."<<cstd<<"\n";
}
#endif
#ifdef SIGNED_ZERO_MATH
// signed zero: 1.0/x distinguishes +0.0 (-> inf) from -0.0 (-> -inf)
if(type1!=type2 || (type1==0 && (1.0/cstd.real()!=1.0/school.real() || 1.0/cstd.imag()!=1.0/school.imag() ))){
std::cout<<"(0.0,"<<b<<")*("<<c<<","<<d<<")="<<school<<"vs."<<cstd<<"\n";
}
#endif
}
int main(){
std::vector<double> numbers{0.0, -0.0, 1.0, INFINITY, -INFINITY, NAN, DBL_MAX, -DBL_MAX};
for(double b: numbers)
for(double c: numbers)
for(double d: numbers)
do_mult(b,c,d);
}
To build/run use:
g++ main.cpp -o main -std=c++11 -DINFINITE_MATH -DSIGNED_ZERO_MATH && ./main
Upvotes: 4