paul
Reputation: 257

Loading 8-bit values using NEON/ARM

I'm trying to load an array of char values into NEON registers, and then treat them as 16-bit or 32-bit integer values. So something like this...

void SubVector(short* c, const unsigned char* a, const unsigned char* b, int n)
{
    for(int i = 0; i < n; i++)
    {
        c[i] = (short)a[i] - (short)b[i];
    }
}

I'm not sure how to load the data. Should I load the 8-bit data into lanes, and then reinterpret the registers as shorts? Or load and convert? What would be the fastest way?

Does anyone have an example of how they would do this with NEON intrinsics?

Thanks!

Upvotes: 2

Views: 3603

Answers (1)

BitBank
Reputation: 8725

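NEON has widening ("long") addition and subtraction instructions that take narrow inputs and produce results twice as wide: 8->16, 16->32 or 32->64 bits. That means you don't need a separate convert step; load the 8-bit values and let the subtract do the widening. You can process 8 elements at a time like this: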

uint8x8_t u88_a, u88_b;
uint16x8_t u168_diff;

u88_a = vld1_u8(a); // load 8 unsigned chars from a[]
u88_b = vld1_u8(b); // load 8 unsigned chars from b[]
u168_diff = vsubl_u8(u88_a, u88_b); // subtract, widening each result to 16 bits
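
For completeness, here is a rough sketch of how that snippet might be wrapped into the full loop from the question. This is my own assumption of how it would fit together, not tested on your target: the function name sub_vector_widening is made up, I use int16_t/uint8_t in place of short/unsigned char, and the #if guard with a scalar fallback is only there so the same file also builds on a machine without NEON. Note that vsubl_u8 returns an unsigned uint16x8_t; since the subtraction wraps modulo 2^16, reinterpreting the bits as int16x8_t gives the signed difference the question asks for.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper (name is an assumption): c[i] = a[i] - b[i],
 * with 8-bit unsigned inputs and 16-bit signed outputs. */
void sub_vector_widening(int16_t *c, const uint8_t *a, const uint8_t *b, int n)
{
    int i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Main loop: 8 elements per iteration. */
    for (; i + 8 <= n; i += 8) {
        uint8x8_t va = vld1_u8(a + i);       /* load 8 bytes from a[] */
        uint8x8_t vb = vld1_u8(b + i);       /* load 8 bytes from b[] */
        uint16x8_t diff = vsubl_u8(va, vb);  /* widening subtract, 8 -> 16 bits */
        /* Same bits viewed as signed 16-bit lanes, then stored to c[]. */
        vst1q_s16(c + i, vreinterpretq_s16_u16(diff));
    }
#endif

    /* Scalar tail for the last n % 8 elements (or everything without NEON). */
    for (; i < n; i++) {
        c[i] = (int16_t)(a[i] - b[i]);
    }
}
```

The scalar tail matters whenever n is not a multiple of 8; without it the vector loop would either skip the last few elements or read past the end of the arrays.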

Upvotes: 6
