A.SDR

Reputation: 187

Transform comparison routine to Intel SIMD

I have a routine that should test whether a float is less than zero. If it is, I store the sign, and in either case I then take the absolute value.

int sign = 1;
if (x < 0)
{
    sign = -1;
}
x = fabs(x);

I looked into the Intel SIMD intrinsics and found dst = _mm_cmplt_ps(a, b), which generates a vector containing 0xffffffff (true) or 0 (false) in each element. But I am stuck there: how do I use dst to tell which elements are negative, so that I can build the sign vector?
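For reference, here is a minimal sketch of one way the compare mask could be turned into a sign vector, assuming four floats per call and SSE2 only. The helper name split_sign_cmp and its array-based interface are hypothetical, chosen just for illustration. It relies on the fact that an all-ones lane reads as -1 when reinterpreted as a 32-bit integer, so ORing the mask with 1 gives -1 or +1.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: splits four floats into a sign (+1/-1) and a magnitude. */
static void split_sign_cmp(const float *in, int *sign_out, float *abs_out)
{
    __m128 x = _mm_loadu_ps(in);

    /* Each lane is all ones (0xffffffff) where x < 0, otherwise all zeros. */
    __m128 neg = _mm_cmplt_ps(x, _mm_setzero_ps());

    /* Reinterpreted as int32 the mask is -1 or 0; OR with 1 gives -1 or +1. */
    __m128i sign = _mm_or_si128(_mm_castps_si128(neg), _mm_set1_epi32(1));

    /* Clear the sign bit of every lane, i.e. fabsf(x). */
    __m128 absx = _mm_andnot_ps(_mm_set1_ps(-0.0f), x);

    _mm_storeu_si128((__m128i *)sign_out, sign);
    _mm_storeu_ps(abs_out, absx);
}
```

Because this goes through an actual comparison, -0.0f and NaN are not "less than zero" and so get a sign of +1, matching the scalar code.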

Upvotes: 1

Views: 237

Answers (2)

Aki Suihkonen

Reputation: 20057

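With one small exception (x == +0.0f), you can generate the integer sign vector with the SSSE3 sign instruction (psignd), applied to the bits of x reinterpreted as 32-bit integers:

_mm_sign_epi32(_mm_set1_epi32(1), _mm_castps_si128(x))

This negates the 1 when the sign bit of x is set, but produces a sign of 0 when x == +0.0f.

If a sign of 0 is not acceptable, x can be made nonzero by ORing it with any mask where 0 < mask < 0x80000000.

auto s = _mm_set1_epi32(1);
auto y = _mm_or_si128(_mm_castps_si128(x), s); // fix for x == 0: force a nonzero integer value
s = _mm_sign_epi32(s, y);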
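As a rough, self-contained sketch of the approach above (assuming x starts out as four floats in memory and that the CPU supports SSSE3), the steps could be wrapped up as follows. The function name sign_via_psignd and the TARGET_SSSE3 macro are hypothetical, introduced only for this example; the macro merely lets GCC and Clang build the function without passing -mssse3.

```c
#include <emmintrin.h>  /* SSE2 */
#include <tmmintrin.h>  /* SSSE3: _mm_sign_epi32 */

/* Hypothetical convenience macro: enables SSSE3 for one function on GCC/Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_SSSE3 __attribute__((target("ssse3")))
#else
#define TARGET_SSSE3
#endif

/* Hypothetical helper: writes +1 or -1 for each of four floats. */
TARGET_SSSE3
static void sign_via_psignd(const float *in, int *sign_out)
{
    /* Reinterpret the float bits as 32-bit integers (no conversion). */
    __m128i x = _mm_castps_si128(_mm_loadu_ps(in));
    __m128i one = _mm_set1_epi32(1);

    /* Setting the lowest bit makes every lane nonzero without touching
       the sign bit, so psignd never returns 0. */
    __m128i y = _mm_or_si128(x, one);

    /* Negates 'one' in each lane where y is negative as an integer. */
    _mm_storeu_si128((__m128i *)sign_out, _mm_sign_epi32(one, y));
}
```

Note that this looks only at the sign bit, so -0.0f yields -1 here, unlike the scalar test x < 0.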

Upvotes: 3

Paul R

Reputation: 213160

Assuming your input values are in a vector __m128 v:

__m128 vmask = _mm_set1_ps(-0.0f);      // create sign bit mask
__m128 vsign = _mm_and_ps(v, vmask);    // create vector of sign bits (MSB)
__m128i vsigni = _mm_add_epi32(_mm_srai_epi32(_mm_castps_si128(vsign), 30), _mm_set1_epi32(1));
                                        // convert sign bits to integer +1/-1 (if needed (*))
v = _mm_andnot_ps(vmask, v);            // clear sign bits in v (i.e. v = fabsf(v))

(*) rather than wasting cycles generating +1/-1 for the sign, consider whether you could just work with the sign bits directly, and omit this step.
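To illustrate the footnote, here is a small sketch that never builds the +1/-1 integers and instead keeps the sign bits and ORs them back in after working on the magnitudes. The signed square root and the function name signed_sqrt4 are made up for the example; they stand in for whatever processing the absolute values actually need.

```c
#include <emmintrin.h>  /* SSE/SSE2 intrinsics */

/* Hypothetical example: computes sign(x) * sqrt(|x|) for four floats. */
static void signed_sqrt4(const float *in, float *out)
{
    const __m128 vmask = _mm_set1_ps(-0.0f);  /* only the sign bit set */
    __m128 v = _mm_loadu_ps(in);

    __m128 vsign = _mm_and_ps(v, vmask);      /* keep just the sign bits */
    v = _mm_andnot_ps(vmask, v);              /* v = fabsf(v) */

    v = _mm_sqrt_ps(v);                       /* work on the magnitudes */

    v = _mm_or_ps(v, vsign);                  /* restore the original signs */
    _mm_storeu_ps(out, v);
}
```

Since the result of the magnitude step is non-negative, ORing in the saved sign bits has the same effect as multiplying by +1/-1, without the integer conversion.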

Upvotes: 2
