Reputation: 11
Can somebody explain to me how the ARM NEON instruction VQABS operates? Per the documentation:
"VQABS returns the absolute value of each element in a vector. If any of the results overflow, they are saturated and the sticky QC flag (FPSCR bit[27]) is set."
If I apply this to a uint16x8 QWORD type, how does the processor determine that there is an overflow? I am puzzled because there is NO operation being performed other than finding the absolute value of a 16-bit value, which surely cannot "overflow".
Upvotes: 1
Views: 304
Reputation: 6354
8bit: -128 ~ +127
16bit: -32768 ~ +32767
....
As you can see, in a signed type the most negative number always has a magnitude one larger than the most positive number.
If you use vabs on -128 (0x80), for example, the return value is the same 0x80, which can be interpreted both ways: as -128 or as +128, depending on the signedness.
It might be fine as long as you interpret the result as unsigned, which is mostly the case. However, 0x80 needs one more bit than 0x7f (127), so overflow might occur in subsequent arithmetic that expects a 7-bit input.
Or you could simply be forced to multiply the result by a signed vector, and then you would be in serious trouble, since 0x80 will be interpreted as -128 by a signed multiply.
vqabs returns 0x7f instead of 0x80. Problem solved.
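To make the difference concrete, here is a rough scalar sketch in plain C of what a single signed 8-bit lane does under each instruction. The names model_vabs_s8 and model_vqabs_s8 and the qc parameter are made up for illustration (they are not NEON intrinsics), and the int flag only stands in for the sticky QC bit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one signed 8-bit lane of VABS:
   the result wraps, so -128 keeps the bit pattern 0x80.
   Returned as uint8_t to expose the raw bits. */
static uint8_t model_vabs_s8(int8_t x)
{
    int v = x;                          /* widen so negation cannot overflow */
    return (uint8_t)(v < 0 ? -v : v);   /* -128 -> 128 -> 0x80 */
}

/* Hypothetical scalar model of one signed 8-bit lane of VQABS:
   the only overflowing input, -128, saturates to +127 and sets *qc
   (an assumption standing in for the sticky QC flag). */
static int8_t model_vqabs_s8(int8_t x, int *qc)
{
    assert(qc != NULL);
    if (x == INT8_MIN) {
        *qc = 1;                        /* sticky: set here, never cleared */
        return INT8_MAX;
    }
    return (int8_t)(x < 0 ? -x : x);
}
```

In this model, an input of -128 comes back as 0x80 from model_vabs_s8, but as 0x7f with qc set from model_vqabs_s8. Every other input gives the same result from both. The 16-bit lanes should behave the same way, with -32768 and 0x7fff as the special values. On real hardware the corresponding intrinsics in arm_neon.h are vabsq_s16 and vqabsq_s16.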
Upvotes: 1