user997112

Reputation: 30615

Check for zeros horizontally across __m128i vector?

I have several __m128i vectors containing 32-bit unsigned integers and I would like to check whether any of the 4 integers is a zero.

I understand how I can "aggregate" the multiple __m128i vectors but eventually I will still end up with a single __m128i vector, which I will then need to check horizontally.

How do I perform the final horizontal check for zero across the last vector?

EDIT: I am using Intel intrinsics, not inline assembly.

Upvotes: 0

Views: 488

Answers (1)

Stephen Canon

Reputation: 106197

Don’t do it. Avoid horizontal operations whenever possible; they are death to the performance of vector code.

Instead, compare the vector to a vector of zeros, then use PMOVMSKB to get a mask in a general-purpose register. The compare sets every bit of a lane that equals zero, so if the mask is non-zero, at least one of the lanes of your vector was zero:

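#include <emmintrin.h>  // SSE2

__m128i yourVector;  // assumed to be loaded or computed elsewhere
__m128i zeroVector = _mm_setzero_si128();

// cmpeq sets a lane to all ones where it equals zero; movemask gathers the byte sign bits.
if (_mm_movemask_epi8(_mm_cmpeq_epi32(yourVector, zeroVector))) {
    // at least one lane of your vector is zero.
}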
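As a fuller sketch of the same idea, the compare-and-movemask test can be wrapped in a function, and several vectors can be aggregated by OR-ing their compare results so that only one movemask is needed at the end. The helper names `any_lane_zero` and `any_lane_zero_in_array`, and the assumption that the data sits in a plain `uint32_t` array holding whole groups of four, are mine for illustration, not something from the question.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if any 32-bit lane of v is zero, 0 otherwise. */
static int any_lane_zero(__m128i v)
{
    __m128i eq = _mm_cmpeq_epi32(v, _mm_setzero_si128());
    return _mm_movemask_epi8(eq) != 0;
}

/* Hypothetical helper: checks n_vectors groups of four uint32_t values.
   Compare results are OR-ed together, so PMOVMSKB runs only once. */
static int any_lane_zero_in_array(const uint32_t *data, size_t n_vectors)
{
    const __m128i zero = _mm_setzero_si128();
    __m128i acc = _mm_setzero_si128();
    for (size_t i = 0; i < n_vectors; ++i) {
        /* Unaligned load, so data needs no particular alignment. */
        __m128i v = _mm_loadu_si128((const __m128i *)(data + 4 * i));
        acc = _mm_or_si128(acc, _mm_cmpeq_epi32(v, zero));
    }
    return _mm_movemask_epi8(acc) != 0;
}
```

The OR keeps everything vertical until the very end, which is the point: the only step that crosses lanes is the single movemask.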

You can also use PTEST if you want to assume SSE4.1.
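One way the PTEST variant might look is sketched below. It still does the compare against zero first; `_mm_testz_si128` then replaces the movemask. The function name `any_lane_zero_sse41` is made up for illustration, and the `target` attribute is a GCC/Clang extension I added so the function builds without `-msse4.1`; with other compilers, enable SSE4.1 through the usual compiler options instead.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_testz_si128 maps to PTEST */

#if defined(__GNUC__) || defined(__clang__)
__attribute__((target("sse4.1")))
#endif
static int any_lane_zero_sse41(__m128i v)
{
    __m128i eq = _mm_cmpeq_epi32(v, _mm_setzero_si128());
    /* _mm_testz_si128(eq, eq) returns 1 when (eq & eq) is all zeros,
       i.e. when no lane compared equal to zero. */
    return !_mm_testz_si128(eq, eq);
}
```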


Taking the question at face value: if you really did need a horizontal AND for some reason, it would be movhlps + andps + shufps + andps. But don’t do that.
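For completeness, here is a rough sketch of that horizontal AND in intrinsics. Note that I have swapped in the integer shuffle `_mm_shuffle_epi32` for the float-domain movhlps/shufps named above, since the data is `__m128i`; the name `horizontal_and_epi32` is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stdint.h>

/* Returns lane0 & lane1 & lane2 & lane3 of v. */
static uint32_t horizontal_and_epi32(__m128i v)
{
    /* Move lanes 2 and 3 down to positions 0 and 1 (the movhlps step). */
    __m128i hi = _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 2, 3, 2));
    __m128i t  = _mm_and_si128(v, hi);   /* lane0 = v0&v2, lane1 = v1&v3 */
    /* Broadcast lane 1 so it lines up with lane 0 (the shufps step). */
    __m128i sw = _mm_shuffle_epi32(t, _MM_SHUFFLE(1, 1, 1, 1));
    __m128i r  = _mm_and_si128(t, sw);   /* lane0 = v0&v1&v2&v3 */
    return (uint32_t)_mm_cvtsi128_si32(r);
}
```

Keep in mind that a horizontal AND does not answer the original question: lanes holding 1 and 2 are both non-zero, yet they AND to zero.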

Upvotes: 6
