baptiste

Reputation: 1169

Strange behaviour when comparing in SSE

I can't find the error in my code. I am trying to compare a buffer of unsigned char values against a constant and store 1 or 0 depending on the result of the comparison. Here is my code (inside a struct):

void operator()(const uint8* src, int32 swidth, int32 sheight, uint8* dst, uint8 value) {
   uint8 t[16];
   __m128i v_one = _mm_set1_epi8((uint8)1);
   __m128i v_value = _mm_set1_epi8(value);

   printf("value = %d\n", value);
   SHOW(t, v_one);
   SHOW(t, v_value);
   std::cout << "****" << std::endl;

   for (int32 i = 0; i < sheight; ++i) {
      const uint8* sdata = src + i * swidth;
      uint8* ddata = dst + i * swidth;
      int32 j = 0;
      for ( ; j <= swidth - 16; j += 16) {
         __m128i s = _mm_load_si128((const __m128i*)(sdata + j));
         __m128i mask = _mm_cmpgt_epi8(s, v_value);

         SHOW(t, s);
         SHOW(t, mask);
         std::cout << std::endl;
      }
   }
}

The first lines of output are what I would expect:

value = 100
  1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

But then the comparison results are wrong:

214 100 199 203 232  50  85 195  70 141 121 160  93 130 242 233
  0   0   0   0   0   0   0   0   0   0 255   0   0   0   0   0

I really don't see where the mistake is.

The SHOW macro is:

#define SHOW(t, r)                  \
  _mm_storeu_si128((__m128i*)t, r); \
  printf("%3d", (int32)t[0]);       \
  for (int32 k = 1; k < 16; ++k)    \
    printf(" %3d", (int32)t[k]);    \
  printf("\n")

Upvotes: 1

Views: 117

Answers (1)

Mike Vine

Reputation: 9852

You are comparing the elements of your s vector with the elements of your v_value vector.

All the elements of v_value are 100. The elements of s are a mix of values.

However, _mm_cmpgt_epi8 works on signed values and as these are bytes it considers values from -128 to +127.

So the only possible values that are > 100 are values in the range 101 to 127.

As you've only got one value in that range (121), that's the only one which has its mask set.
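To make the signed interpretation concrete, here is a minimal scalar sketch of what a signed byte-wise greater-than does to each lane. The function name signed_gt_mask is hypothetical and only for illustration. It relies on the uint8-to-int8 cast wrapping in two's complement, which mainstream compilers do and which C++20 guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scalar model of a signed byte-wise "greater than".
// Writes 0xFF where s[k] > value when BOTH are reinterpreted as signed
// bytes, and 0x00 otherwise.
void signed_gt_mask(const std::uint8_t* s, std::uint8_t* mask,
                    std::size_t n, std::uint8_t value) {
    assert(s != nullptr && mask != nullptr);
    // Reinterpret the threshold as a signed byte (100 stays 100).
    const auto b = static_cast<std::int8_t>(value);
    for (std::size_t k = 0; k < n; ++k) {
        // Values above 127 wrap to negative numbers, e.g. 214 becomes -42.
        const auto a = static_cast<std::int8_t>(s[k]);
        mask[k] = (a > b) ? 0xFF : 0x00;
    }
}
```

Fed the 16 input bytes from the question with value = 100, this sets only the lane holding 121, which matches the mask row printed in the question.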

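To see the signed interpretation in your own output, change uint8 t[16]; to int8 t[16]; and the printed values will be the ones the comparison actually works with (for example, 214 prints as -42 and a set mask byte prints as -1).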
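If what you actually want is an unsigned comparison, one common workaround is to flip the top bit of both operands before calling _mm_cmpgt_epi8, since SSE2 has no unsigned byte greater-than. The sketch below is an assumption about the intended result (1 where src > value, else 0), not your original code: the name threshold_u8 is hypothetical, it uses unaligned loads and stores, and it assumes the length is a multiple of 16 so the scalar tail is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: dst[k] = (src[k] > value) ? 1 : 0, bytes treated as unsigned.
// Sketch assumption: n is a multiple of 16 (no scalar tail loop).
void threshold_u8(const std::uint8_t* src, std::uint8_t* dst,
                  std::size_t n, std::uint8_t value) {
    assert(n % 16 == 0);
    // XOR with 0x80 maps unsigned 0..255 onto signed -128..127 and keeps the order.
    const __m128i bias = _mm_set1_epi8(static_cast<char>(-128));
    const __m128i v_one = _mm_set1_epi8(1);
    // Bias the threshold once, outside the loop.
    const __m128i v_value =
        _mm_xor_si128(_mm_set1_epi8(static_cast<char>(value)), bias);

    for (std::size_t j = 0; j < n; j += 16) {
        __m128i s = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + j));
        s = _mm_xor_si128(s, bias);                 // bias the data the same way
        __m128i mask = _mm_cmpgt_epi8(s, v_value);  // 0xFF where src > value
        // Turn 0xFF/0x00 into 1/0 and write the result.
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + j),
                         _mm_and_si128(mask, v_one));
    }
}
```

The AND with v_one is what converts the all-ones mask into the 1 or 0 you said you wanted to store. If your buffers are guaranteed to be 16-byte aligned, the aligned load and store variants can be used instead.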

Upvotes: 4
