Reputation: 757
I want to write a C program that counts the number of bytes of b that lie strictly inside the range a...c, equivalent to the code below:
char a[16], b[16], c[16];
int counter = 0;
for (int i = 0; i < 16; i++)
{
    /* count positions where b[i] lies strictly between a[i] and c[i] */
    if ((a[i] < b[i]) && (b[i] < c[i]))
        counter++;
}
return counter;
I am planning to do something like this, with a, b and c loaded into __m128i registers:
__m128i result1 = _mm_cmpgt_epi8 (b, a);
__m128i result2 = _mm_cmplt_epi8 (b, c);
unsigned short out1 = _mm_movemask_epi8(result1);
unsigned short out2 = _mm_movemask_epi8(result2);
unsigned short out3 = out1 & out2;
unsigned short out4 = _mm_popcnt_u32(out3);
Is my method correct? Is there a better way to do this?
Upvotes: 2
Views: 396
Reputation: 11768
Your approach looks reasonable. You could save an instruction by doing the AND on the SIMD registers, so that only one _mm_movemask_epi8 is needed:
__m128i result1 = _mm_cmpgt_epi8 (b, a);
__m128i result2 = _mm_cmplt_epi8 (b, c);
__m128i mask = _mm_and_si128(result1, result2);
int mask2 = _mm_movemask_epi8(mask);
int counter = _mm_popcnt_u32(mask2);
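For completeness, here is a minimal sketch of how the snippet above might be wrapped into a function and checked against the scalar loop. The function names count_in_range_scalar and count_in_range_sse2 are hypothetical, and a few things are assumptions rather than part of the original code: the arrays are declared as signed char because _mm_cmpgt_epi8 and _mm_cmplt_epi8 compare signed bytes (plain char may be unsigned on some platforms), unaligned loads are used because nothing is known about the arrays' alignment, and __builtin_popcount (a GCC/Clang builtin) is swapped in for _mm_popcnt_u32 so the sketch builds without -mpopcnt or -msse4.2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: loads, byte compares, AND, movemask */

/* Scalar reference: counts positions i where a[i] < b[i] < c[i]. */
int count_in_range_scalar(const signed char *a, const signed char *b,
                          const signed char *c)
{
    int counter = 0;
    for (int i = 0; i < 16; i++) {
        if (a[i] < b[i] && b[i] < c[i])
            counter++;
    }
    return counter;
}

/* SSE2 sketch of the same count over exactly 16 bytes. */
int count_in_range_sse2(const signed char *a, const signed char *b,
                        const signed char *c)
{
    /* Unaligned loads: no alignment is assumed for the input arrays. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i vc = _mm_loadu_si128((const __m128i *)c);

    __m128i gt_a = _mm_cmpgt_epi8(vb, va);     /* 0xFF where b > a (signed) */
    __m128i lt_c = _mm_cmplt_epi8(vb, vc);     /* 0xFF where b < c (signed) */
    __m128i mask = _mm_and_si128(gt_a, lt_c);  /* 0xFF where both hold */

    /* One bit per byte lane; the set bits are the matching positions. */
    int bits = _mm_movemask_epi8(mask);

    /* Assumption: GCC/Clang builtin used in place of _mm_popcnt_u32. */
    return __builtin_popcount((unsigned)bits);
}
```

Calling both functions on the same three 16-byte arrays should give the same count. If the data is really unsigned, the signed compares give wrong results for bytes of 128 and above, so the inputs would need adjusting first (for example by XORing each vector with 0x80).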
Upvotes: 5