Reputation: 187
I have a routine where I should test whether a float number is less than zero. If it is, I should store the sign, and in either case I then take its absolute value.
int sign = 1;
if (x < 0)
{
sign = -1;
}
x = fabs(x);
I looked into the Intel SIMD intrinsics and found the instruction dst = _mm_cmplt_ps(a,b), which generates a vector containing 0xffffffff (for true) or 0 (for false) in each element, but I am stuck there: how can I tell which elements of the dst vector are negative, so that I can build the sign_vector?
Upvotes: 1
Views: 237
Reputation: 20057
With one small exception (x == +0.0f), you can generate the integer sign vector with the SSSE3 sign instruction (psignd), treating the float bits in x as 32-bit integers:
_mm_sign_epi32(_mm_set1_epi32(1), x)
This negates the 1 in every lane where x < 0, but produces a sign of 0 where x == +0.0f (all bits zero).
If a sign of 0 is not allowed, x can be made nonzero by ORing it with any mask satisfying 0 < mask < 0x80000000, which leaves the sign bit untouched.
auto s=_mm_set1_epi32(1);
auto y=_mm_or_si128(x, s); // fix for x==0
s=_mm_sign_epi32(s, y);
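For reference, here is a minimal self-contained sketch of both variants in plain C. It assumes the floats come from memory and are reinterpreted as integers with _mm_castps_si128. The helper names and the SIGN_TARGET macro are made up for illustration; the macro is only a GCC/Clang convenience so the file builds without -mssse3.

```c
#include <tmmintrin.h>  /* SSSE3: _mm_sign_epi32 */

/* Illustrative convenience: enable SSSE3 per function on GCC/Clang
   when the file is not already compiled with -mssse3. */
#if defined(__GNUC__) && !defined(__SSSE3__)
#define SIGN_TARGET __attribute__((target("ssse3")))
#else
#define SIGN_TARGET
#endif

/* Hypothetical helper: yields -1, 0 or +1 per lane (0 only for +0.0f). */
SIGN_TARGET
static void sign_or_zero4(const float *in, int *sign_out)
{
    /* Reinterpret the float bits as 32-bit integers; no conversion happens. */
    __m128i x = _mm_castps_si128(_mm_loadu_ps(in));
    __m128i s = _mm_sign_epi32(_mm_set1_epi32(1), x);
    _mm_storeu_si128((__m128i *)sign_out, s);
}

/* Hypothetical helper: yields only -1 or +1 per lane. */
SIGN_TARGET
static void sign_nonzero4(const float *in, int *sign_out)
{
    __m128i x = _mm_castps_si128(_mm_loadu_ps(in));
    __m128i s = _mm_set1_epi32(1);
    /* OR with 1 makes every lane nonzero and leaves the sign bit alone. */
    __m128i y = _mm_or_si128(x, s);
    _mm_storeu_si128((__m128i *)sign_out, _mm_sign_epi32(s, y));
}
```

Note that -0.0f has its sign bit set, so both helpers report -1 for it, unlike the scalar test x < 0.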
Upvotes: 3
Reputation: 213160
Assuming your input values are in a vector __m128 v:
__m128 vmask = _mm_set1_ps(-0.0f); // create sign bit mask
__m128 vsign = _mm_and_ps(v, vmask); // create vector of sign bits (MSB)
__m128i vsigni = _mm_add_epi32(_mm_srai_epi32(_mm_castps_si128(vsign), 30), _mm_set1_epi32(1));
// convert sign bits to integer +1/-1 (if needed (*))
v = _mm_andnot_ps(vmask, v); // clear sign bits in v (i.e. v = fabsf(v))
(*) rather than wasting cycles generating +1/-1 for the sign, consider whether you could just work with the sign bits directly, and omit this step.
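Put together as a self-contained sketch (SSE2 only), assuming the data lives in plain arrays of four elements. The function names are invented for illustration. The second helper shows one way of working with the sign bits directly: XORing them back onto a magnitude restores the original sign without ever building +1/-1.

```c
#include <emmintrin.h>  /* SSE2 */

/* Hypothetical helper: split 4 floats into +1/-1 signs and magnitudes. */
static void split_sign_abs4(const float *in, int *sign_out, float *abs_out)
{
    const __m128 vmask = _mm_set1_ps(-0.0f);    /* 0x80000000 in every lane */
    __m128 v = _mm_loadu_ps(in);
    __m128 vsign = _mm_and_ps(v, vmask);        /* keep only the sign bits */
    /* Arithmetic shift: 0x80000000 >> 30 is -2 and 0 >> 30 is 0;
       adding 1 then gives -1 or +1. */
    __m128i vsigni = _mm_add_epi32(
        _mm_srai_epi32(_mm_castps_si128(vsign), 30), _mm_set1_epi32(1));
    _mm_storeu_si128((__m128i *)sign_out, vsigni);
    _mm_storeu_ps(abs_out, _mm_andnot_ps(vmask, v));  /* clear the sign bits */
}

/* Hypothetical helper: flip the sign of mag wherever the matching lane
   of from has its sign bit set. */
static void apply_sign4(const float *mag, const float *from, float *out)
{
    const __m128 vmask = _mm_set1_ps(-0.0f);
    __m128 vsign = _mm_and_ps(_mm_loadu_ps(from), vmask);
    _mm_storeu_ps(out, _mm_xor_ps(_mm_loadu_ps(mag), vsign));
}
```

Because this looks only at the sign bit, -0.0f counts as negative here, which differs from the scalar x < 0 test.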
Upvotes: 2