Jdarc

Reputation: 75

SSE - compare and put my value?

I am looking at the Intel Intrinsics Guide.

My SSE experience is rather shaky.


OK, I have an array of ints named 'source' (a long one, really).

I want to replace each of its values that matches a certain value.

Example:

int source[] = {4,5,9,8};
int mask[] = {4,4,4,4};
int replacer[] = {3,3,3,3};

So the final source should look like {3,5,9,8}.

I would like to achieve this using SSE < 4.

The closest instruction I came across is _mm_cmpeq_epi32:

FOR j := 0 to 3
    i := j*32
    dst[i+31:i] := ( a[i+31:i] == b[i+31:i] ) ? 0xFFFFFFFF : 0
ENDFOR

Now I would like something that writes my value into the original array where the comparison is true, and leaves the source value untouched otherwise:

FOR j := 0 to 3
    i := j*32
    dst[i+31:i] := ( a[i+31:i] == b[i+31:i] ) ? my_mask_value_here : source_value_untouched
ENDFOR

Is there anything that achieves what I am trying to do? I can't figure it out, even when combining different instructions.

Thanks

Upvotes: 1

Views: 592

Answers (1)

Jester

Reputation: 58762

Once you have your mask from PCMPEQD, if you have SSE4.1 you can use the PBLENDVB instruction, which is designed for exactly this purpose. Otherwise, you can use PAND, PANDN and POR to emulate it. MASKMOVDQU can also be used.

Here is source code demonstrating all three ways:

#include <stdio.h>
#include <x86intrin.h>

// print the four 32-bit lanes of a vector
static void print_epi32(__m128i v)
{
    int out[4];
    _mm_storeu_si128((__m128i*)out, v);
    printf("%d %d %d %d\n", out[0], out[1], out[2], out[3]);
}

int main(void)
{
    int source[] = {4,5,9,8};
    int mask[] = {4,4,4,4};
    int replacer[] = {3,3,3,3};

    // unaligned loads: plain int arrays are not guaranteed to be 16-byte aligned
    __m128i src = _mm_loadu_si128((const __m128i*)source);
    __m128i cmp = _mm_loadu_si128((const __m128i*)mask);
    __m128i rep = _mm_loadu_si128((const __m128i*)replacer);

    // all ones in each lane where source == mask, zero elsewhere
    __m128i bitmask = _mm_cmpeq_epi32(src, cmp);

    // manual version (SSE2): (replacer & bitmask) | (source & ~bitmask)
    __m128i result = _mm_and_si128(rep, bitmask);
    __m128i tmp = _mm_andnot_si128(bitmask, src);
    result = _mm_or_si128(result, tmp);
    print_epi32(result);

    // maskmovdqu version (SSE2): stores only the bytes whose mask byte has its high bit set
    result = src;
    _mm_maskmoveu_si128(rep, bitmask, (char*)&result);
    print_epi32(result);

    // SSE4.1 version (with GCC or Clang, build with -msse4.1)
    result = _mm_blendv_epi8(src, rep, bitmask);
    print_epi32(result);

    return 0;
}
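
Since the question mentions a long array, the manual SSE2 version can be extended to a loop roughly like this. Treat it as a sketch only: the function name replace_equal_sse2 and its parameters are made up for illustration, it assumes unaligned loads and stores so that any int pointer works, and it falls back to a scalar loop for the leftover elements when the length is not a multiple of 4.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper (not from any library): replaces every element of
   data[0..n) that equals `match` with `replacement`, using SSE2 only. */
void replace_equal_sse2(int *data, size_t n, int match, int replacement)
{
    const __m128i vmatch = _mm_set1_epi32(match);        /* {match, match, match, match} */
    const __m128i vrepl  = _mm_set1_epi32(replacement);  /* {repl, repl, repl, repl} */
    size_t i = 0;

    /* vector part: four ints per iteration */
    for (; i + 4 <= n; i += 4) {
        __m128i v    = _mm_loadu_si128((const __m128i *)(data + i));
        __m128i mask = _mm_cmpeq_epi32(v, vmatch);        /* all ones where equal */
        __m128i out  = _mm_or_si128(_mm_and_si128(mask, vrepl),   /* replacement where equal */
                                    _mm_andnot_si128(mask, v));   /* original elsewhere */
        _mm_storeu_si128((__m128i *)(data + i), out);
    }

    /* scalar tail: up to three remaining elements */
    for (; i < n; ++i) {
        if (data[i] == match)
            data[i] = replacement;
    }
}
```

Calling replace_equal_sse2(source, 4, 4, 3) on the example array from the question should leave it as {3,5,9,8}. If the array is known to be 16-byte aligned, the aligned load and store variants could be used instead.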

Upvotes: 4
