Reputation: 57
I'm trying to translate this C/C++ code into SIMD intrinsics.
for(int i=0 ; i < length ; i++)
    A[i] = B[C[i]];
I can translate the code below (C/C++)
for(int i=0 ; i < length ; i++)
    A[i] = B[i];
to SIMD code (using intrinsic functions)
for(int i=0 ; i + 16 <= length ; i+=16) {
    uint8x16_t v0 = vld1q_u8(B+i); // load 16 bytes from the source array B
    vst1q_u8(A+i, v0);             // store them to the destination array A
}
I know that the keyword for solving this problem is interleaving, but I can't find a solution.
Thanks.
Edit
For more information
unsigned char A [32] = {0,}; // Output Array
unsigned char B [20] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}; // Source values to copy into A
unsigned int C [32] = {19,15,11,10,5,3,6,4,5,19,10,14,16,14,8,9,10,20,11,1, 0, 3, 5, 19, 20, 11, 13, 9, 30, 31, 7}; // Indices into B
Is there an intrinsic function that would let me write code of the following form?
int length = 32;
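for (int i = 0; i + 8 <= length; i += 8)
{
    uint8x8_t v_idx = vld1_u8(C + i); // load 8 indices (assumes they have been narrowed to uint8_t)
    uint8x8_t v = func(B, v_idx);     // hypothetical lookup: v[k] = B[v_idx[k]]
    vst1_u8(A + i, v);                // store the 8 looked-up values to A
}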
Will output 20, 16, 12, 11, 6, 4, 7, 5, 6, 6, 20, 11, 15, 17, 15, 9, 10, 11, 21, 12, 2, 1, 4, 6, 20, 21, 12, 14, 10, 31, 32, 8
[Note]
A and B are of type uint8_t * because they are images with values between 0 and 255, and C is of type uint32_t * because it holds indices into B.
Upvotes: 3
Views: 2002
Reputation: 17502
It's a bit hard to be sure since you didn't provide a lot of information, but vqtbl1_u8 might be what you're looking for. It's AArch64-only, though vtbl1_u8 is available on armv7.
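For reference, here is a plain C model of what an 8-byte table lookup such as vtbl1_u8 computes. The function name tbl1_u8_model is made up for illustration; the one behaviour worth noticing is that an out-of-range index produces 0 rather than reading past the table.

```c
#include <stdint.h>

/* Hypothetical scalar model of an 8-byte table lookup (what vtbl1_u8 does):
   out[i] = table[idx[i]] when idx[i] < 8, otherwise 0. */
static void tbl1_u8_model(uint8_t out[8], const uint8_t table[8],
                          const uint8_t idx[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (idx[i] < 8) ? table[idx[i]] : 0;
}
```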
A quick example:
#include <arm_neon.h>
#include <stdint.h>
#include <stdio.h>

int main (void) {
    uint8_t bp[] = { 1, 1, 2, 3, 5, 8, 13, 21 }; /* table (B) */
    uint8_t cp[] = { 0, 2, 4, 6, 1, 3, 5, 7 };   /* indices (C) */
    uint8x8_t b = vld1_u8(bp);
    uint8x8_t c = vld1_u8(cp);
    uint8x8_t a = vtbl1_u8(b, c);                /* a[i] = b[c[i]] */
    uint8_t ap[8];
    vst1_u8(ap, a);
    for (int x = 0 ; x < 8 ; x++)
        printf("%3u ", ap[x]);
    printf("\n");
    return 0;
}
Will output 1 2 5 13 1 3 8 21
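Applied to arrays like the ones in the question (a uint8_t table of at most 32 entries and uint32_t indices), the same idea might look like the sketch below. This is only one possible approach, not a tuned implementation. The helper name gather_u8 is hypothetical, and it assumes that returning 0 for an index outside the table is acceptable. On AArch64 it pads the table to 32 bytes, narrows the indices with saturation, and uses vqtbl2_u8; everywhere else, and for the leftover elements, it falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: a[i] = b[c[i]], or 0 when c[i] >= b_len.
   The NEON path is only taken when the table fits in 32 bytes. */
static void gather_u8(uint8_t *a, const uint8_t *b, size_t b_len,
                      const uint32_t *c, size_t length)
{
    size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_NEON)
    if (b_len <= 32) {
        /* Zero-pad the table so two full 16-byte loads are safe. */
        uint8_t padded[32] = {0};
        memcpy(padded, b, b_len);
        uint8x16x2_t table;
        table.val[0] = vld1q_u8(padded);
        table.val[1] = vld1q_u8(padded + 16);
        for (; i + 8 <= length; i += 8) {
            /* Saturating narrow 32 -> 16 -> 8 bits, so a huge index
               becomes 255 and still counts as out of range. */
            uint16x4_t lo = vqmovn_u32(vld1q_u32(c + i));
            uint16x4_t hi = vqmovn_u32(vld1q_u32(c + i + 4));
            uint8x8_t idx = vqmovn_u16(vcombine_u16(lo, hi));
            /* Indices >= 32 give 0; indices in [b_len, 32) hit the padding. */
            vst1_u8(a + i, vqtbl2_u8(table, idx));
        }
    }
#endif
    /* Scalar path: leftover elements, large tables, non-AArch64 builds. */
    for (; i < length; i++)
        a[i] = (c[i] < b_len) ? b[c[i]] : 0;
}
```

With the question's B (the values 1 to 20), a call such as gather_u8(A, B, 20, C, 32) would fill A with B[C[i]] for every valid index and with 0 wherever C[i] is 20 or larger. On armv7 the same structure should work with vtbl4_u8 and four 8-byte table registers.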
Upvotes: 4