user997112
user997112

Reputation: 30605

Intrinsics example- what is happening here (complete code included)?

I found the below code from:

http://msdn.microsoft.com/en-us/library/bb513993(v=vs.90).aspx

I am trying to understand exactly what the code is doing to then tinker around and suit it to my needs. I am interestd in using intrincs to find space-characters within a string very fast and I think these string intrinsics could help me.

I don't really understand the "commentary" provided on the printf statements, why the expected result is what the author stated?

(You should be able to copy and paste the below and run straight away)

#include <stdio.h>
#include <nmmintrin.h>
#include <iostream>

using namespace std;

int main ()
{
    __m128i a, b;


    const int mode = _SIDD_UWORD_OPS | _SIDD_CMP_EQUAL_EACH | _SIDD_LEAST_SIGNIFICANT;
    //      _SIDD_UWORD_OPS         a and b contain strings of unsigned 16-bit characters.  
    //      _SIDD_CMP_EQUAL_EACH    Find if equal each mode: This implements the string equality algorithm.
    //      _SIDD_LEAST_SIGNIFICANT sets the same bit as _SIDD_BIT_MASK


    a.m128i_u16[7] = 0xFFFF;
    a.m128i_u16[6] = 0xFFFF;
    a.m128i_u16[5] = 0xFFFF;
    a.m128i_u16[4] = 0xFFFF;
    a.m128i_u16[3] = 0xFFFF;
    a.m128i_u16[2] = 0xFFFF;
    a.m128i_u16[1] = 0xFFFF;
    a.m128i_u16[0] = 0xFFFF;

    b.m128i_u16[7] = 0x0001;
    b.m128i_u16[6] = 0x0001;
    b.m128i_u16[5] = 0x0001;
    b.m128i_u16[4] = 0x0001;
    b.m128i_u16[3] = 0x0001;
    b.m128i_u16[2] = 0x0001;
    b.m128i_u16[1] = 0x0001;
    b.m128i_u16[0] = 0x0001;

    int returnValue = _mm_cmpistra(a, b, mode);
    printf_s("_mm_cmpistra return value should be 1: %i\n", returnValue);

    b.m128i_u16[4] = 0x0000;
    returnValue = _mm_cmpistra(a, b, mode);
    printf_s("_mm_cmpistra return value should be 0: %i\n", returnValue);

    b.m128i_u16[5] = 0xFFFF;
    returnValue = _mm_cmpistrc(a, b, mode);
    printf_s("_mm_cmpistrc return value should be 0: %i\n", returnValue);

    b.m128i_u16[4] = 0x0001;
    returnValue = _mm_cmpistrc(a, b, mode);
    printf_s("_mm_cmpistrc return value should be 1: %i\n", returnValue);

    returnValue = _mm_cmpistri(a, b, mode);
    printf_s("_mm_cmpistri return value should be 5: %i\n", returnValue);

    b.m128i_u16[0] = 0xFFFF;
    __m128i fullResult = _mm_cmpistrm(a, b, mode);
    printf_s("_mm_cmpistrm return value: 0x%016I64x 0x%016I64x\n",
                fullResult.m128i_u64[1], fullResult.m128i_u64[0]);

    returnValue = _mm_cmpistro(a, b, mode);
    printf_s("_mm_cmpistro return value should be 1: %i\n", returnValue);

    returnValue = _mm_cmpistrs(a, b, mode);
    printf_s("_mm_cmpistrs return value should be 0: %i\n", returnValue);

    a.m128i_u16[7] = 0x0000;
    returnValue = _mm_cmpistrs(a, b, mode);
    printf_s("_mm_cmpistrs return value should be 1: %i\n", returnValue);

    returnValue = _mm_cmpistrz(a, b, mode);
    printf_s("_mm_cmpistrz return value should be 0: %i\n", returnValue);

    b.m128i_u16[7] = 0x0000;
    returnValue = _mm_cmpistrz(a, b, mode);
    printf_s("_mm_cmpistrz return value should be 1: %i\n", returnValue);

    int bb;
    cin >> bb;

    return 0;
}

Upvotes: 0

Views: 758

Answers (1)

Mike Julier
Mike Julier

Reputation: 161

You've either gone nuts by now, or figured it out. But... The intrinsics that end in 'i' return an index of some occurrence. The intrinsics that end in 'm' return a mask (bit/byte/word depending on...) The intrinsics ending in 'c', 'o', 's', or 'z' return a single EFLAGS bit. The intrinsics ending in 'a' return the Boolean result of CFlag AND ZFlag. (Not the AFlag.) this was done so that there would be an easy way to test for both the occurrence of the event you're looking for as well as the end of one of the inputs with a single instruction. I really did a terrible job of thinking through how to use these instructions in higher-level languages. At the time I was perfectly content dropping into inline ASM to access these instructions. They really are a bit to... complicated... to use efficiently in higher level languages, like C. Sorry about that. If you do drop into ASM, they can be a lot of fun. Kinda cool how much work you can get done with a 3 or 4 instruction loop.

Upvotes: 3

Related Questions