San_kim

Reputation: 41

How to add a scalar in NEON?

I want to add a scalar to every lane of a vector. Here is what I've tried:

uint32x4_t result, result2, op, one;

// op + 1

result = vaddq_u32(op, 1); // error: 1 is not a vector

one = vdupq_n_u32(1);

result2 = vaddq_u32(op, one); // ok

What is the best way to save memory space when doing this?

Upvotes: 4

Views: 1948

Answers (1)

NEON has no instructions for vector-by-scalar ALU operations in general; the only by-scalar forms are multiplications on elements of 16 bits or wider.

Nor are there instructions for adding or subtracting an immediate value.

What you already did is the way it is supposed to be done.
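To illustrate the asymmetry described above, here is a minimal sketch. It assumes `arm_neon.h` is available and falls back to plain C elsewhere. The helper names `times_k`, `plus_k`, `scale_and_offset4` and the `HAVE_NEON` macro are made up for this example. `vmulq_n_u32` is the by-scalar multiply intrinsic; addition has no such form, so the scalar is broadcast with `vdupq_n_u32` first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1

/* Multiply has a by-scalar form, so no explicit broadcast is needed. */
static inline uint32x4_t times_k(uint32x4_t v, uint32_t k) {
    return vmulq_n_u32(v, k);
}

/* Add has no by-scalar form: broadcast the scalar, then add vectors. */
static inline uint32x4_t plus_k(uint32x4_t v, uint32_t k) {
    return vaddq_u32(v, vdupq_n_u32(k));
}
#else
#define HAVE_NEON 0
#endif

/* Computes out[i] = in[i] * mul + add for exactly four lanes. */
void scale_and_offset4(const uint32_t in[4], uint32_t mul, uint32_t add,
                       uint32_t out[4]) {
    assert(in != NULL && out != NULL);
#if HAVE_NEON
    uint32x4_t v = vld1q_u32(in);
    v = plus_k(times_k(v, mul), add);
    vst1q_u32(out, v);
#else
    /* Scalar fallback (assumption: only used when NEON is unavailable). */
    for (int i = 0; i < 4; ++i) {
        out[i] = in[i] * mul + add;
    }
#endif
}
```

Calling `scale_and_offset4` with the input `{1, 2, 3, 4}`, `mul = 3` and `add = 1` should store `{4, 7, 10, 13}` in `out`.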

One thing you could try to improve performance is to declare the vector of 1s as a constant outside the loop, in the hope that the compiler is smart enough not to reload the same value on every iteration.
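A rough sketch of that hoisting, assuming the data is a plain `uint32_t` array. The function name `add_one_u32` and the `HAVE_NEON` macro are hypothetical, and the scalar loop doubles as a fallback on targets without NEON. Whether the compiler actually keeps `one` in a register is not guaranteed, so the disassembly is still worth a look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Adds 1 to every element of data[0..n). */
void add_one_u32(uint32_t *data, size_t n) {
    assert(data != NULL || n == 0);
    size_t i = 0;
#if HAVE_NEON
    /* Build the vector of 1s once, before the loop. */
    const uint32x4_t one = vdupq_n_u32(1);
    for (; i + 4 <= n; i += 4) {
        uint32x4_t v = vld1q_u32(data + i);
        vst1q_u32(data + i, vaddq_u32(v, one));
    }
#endif
    /* Scalar tail for leftover elements (and the non-NEON fallback). */
    for (; i < n; ++i) {
        data[i] += 1;
    }
}
```

As for memory, the constant costs a single 128-bit register while the loop runs, or 16 bytes if the compiler decides to spill it or load it from a literal pool.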

Unfortunately, the available ARM compilers aren't that reliable when it comes to NEON. Checking the disassembly is pretty much a necessity, which defeats the point of writing with intrinsics in the first place.

Upvotes: 3
