Reputation: 21099
I need to find whether two finite floating point values A
and B
have different signs or one of them is zero.
In many code examples I see the test as follows:
if ( (A <= 0 && B >= 0) || (A >= 0 && B <= 0) )
It works fine but looks inefficient to me since many conditions are verified here and each condition branch is a slow operation for modern CPUs.
The other option is to use multiplication of the floating-point values:
if ( A * B <= 0 )
It shall work even in case of overflow, since infinity values preserve proper sign.
What is the most efficient way for performing the check (one of these two or probably some other)?
Upvotes: 0
Views: 661
Reputation: 153348
Efficient check that two floating point values have distinct signs
if ( A * B <= 0 )
fails to distinguish the sign when A
or B
are of the set -0.0, +0.0
.
Consider signbit()
if (signbit(A) == signbit(B)) {
; // same sign
} else {
; // distinct signs
}
Trust the compiler to form efficient code - or use a better compiler.
OP then formed a different goal: "ether two finite floating point values A and B have distinct signs or one of them is zero."
Of course if (signbit(A) != signbit(B) || A == 0.0 || B == 0.0)
meets this newer functionality without the rounding to 0.0 issue @Eric Postpischil of if ( A * B <= 0 )
, but may not meet the fuzzy requirement of most efficient way.
The best way to form efficient code and avoid premature optimization really the root of all evil is to see the larger context.
Upvotes: 3
Reputation: 41764
It works fine but looks inefficient to me since many conditions are verified here and each condition branch is a slow operation for modern CPUs.
An optimizing compiler won't branch unnecessarily. Don't worry about premature optimization unless you're in a very tight hot loop. Your first snippet is compiled to only 2 branches in x86-64 and ARM64 in the compilers I tried. It can also be compiled to a branchless version. Of course it may still be slower than a simple A * B <= 0
but there's no way to know that for sure without a proper benchmark
If you don't care about NaN then you can simply do some bitwise operations:
auto a = std::bit_cast<uint64_t>(A);
auto b = std::bit_cast<uint64_t>(B);
const auto sign_mask = 1ULL << 63;
return ((a ^ b) & sign_mask) || (((a | b) & ~sign_mask) == 0);
If A and B have different signs then it'll be matched by (a ^ b) & sign_mask
. If they have same sign then they both must be zero which will be caught by the latter condition. But this works in the integer so it may incur a cross-domain penalty when moving the value from float to int domain
If std::bit_cast
not available then just replace with memcpy(&a, &A, sizeof A)
Again, do a benchmark to determine what's best for your target. There's no solution that's fastest on every microarchitecture available. If you really run this a lot of times in a loop then you should use SIMD instead to check for multiple values at the same time. You should also use profile-guided optimization in order for the compiler to know where and when to branch
Upvotes: 1