Reputation: 73
As we can see from "arm integer NEON operations cycles" and "arm float NEON operations cycles", integer multiply operations do not seem to have a definite advantage over floating-point multiply operations. When I converted my floating-point code to fixed point, I had to add a "shift" instruction after the fixed-point multiplication/division instructions. The extra instructions increased the cycle count, so the performance of my program got worse with fixed point (14000 cycles for the floating-point code, 26000 cycles for the fixed-point code).
Are there any special NEON instructions dedicated to fixed-point operations (multiplications and divisions)? I only found one instruction, which just converts between fixed point and float. Is there an efficient way of writing fixed-point programs in NEON?
I wrote the following sample floating-point code.
VMUL Q14.F32,Q8.F32,Q2.F32
VMUL Q15.F32,Q8.F32,Q3.F32
VLD2 {Q10.F32,Q11.F32},[pTw2@256],TwdStep
VLD2 {Q4.F32,Q5.F32},[pT1@256],fftSize
VMLA Q14.F32,Q9.F32,Q3.F32
VMLS Q15.F32,Q9.F32,Q2.F32
The code was then converted to fixed point by inserting shift operations after the multiply instructions (VMUL/VMLA/VMLS).
VMUL Q14.S32,Q8.S32,Q2.S32
VMUL Q15.S32,Q8.S32,Q3.S32
VLD2 {Q10.S32,Q11.S32},[pTw2@256],TwdStep
VLD2 {Q4.S32,Q5.S32},[pT1@256],fftSize
VMLA Q14.S32,Q9.S32,Q3.S32
VMLS Q15.S32,Q9.S32,Q2.S32
VRSHR Q14.S32,Q14.S32,#12 ;Shift instructions to account for fixed point
VRSHR Q15.S32,Q15.S32,#12 ;
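For reference, the arithmetic in one lane of the fixed-point listing above can be sketched in scalar C. This is only an illustration under stated assumptions: the `#12` shift is taken to mean a format with 12 fractional bits (Q12), the names `rshr_round`, `to_q12` and `mul_pair_q12` are hypothetical, and the products are held in 64 bits, whereas `VMUL.S32` keeps only the low 32 bits of each product.

```c
#include <assert.h>
#include <stdint.h>

#define Q_FRAC_BITS 12  /* assumption: the #12 shift implies 12 fractional bits */

/* Rounding shift right, a scalar analogue of VRSHR.
   Assumes >> on a negative value is an arithmetic shift, which is
   implementation-defined in C but true on mainstream compilers. */
static int64_t rshr_round(int64_t x, int n)
{
    assert(n >= 1 && n <= 62);
    return (x + ((int64_t)1 << (n - 1))) >> n;
}

/* Convert a real value to Q12, rounding to nearest (hypothetical helper). */
static int32_t to_q12(double v)
{
    return (int32_t)(v * (1 << Q_FRAC_BITS) + (v >= 0 ? 0.5 : -0.5));
}

/* One lane of the listing, with (a0, a1) standing for (Q8, Q9)
   and (b0, b1) standing for (Q2, Q3):
     out0 = a0*b0 + a1*b1   (VMUL, VMLA, then one VRSHR)
     out1 = a0*b1 - a1*b0   (VMUL, VMLS, then one VRSHR)
   The products are accumulated first and shifted once, as in the listing. */
static void mul_pair_q12(int32_t a0, int32_t a1, int32_t b0, int32_t b1,
                         int32_t *out0, int32_t *out1)
{
    int64_t acc0 = (int64_t)a0 * b0 + (int64_t)a1 * b1;
    int64_t acc1 = (int64_t)a0 * b1 - (int64_t)a1 * b0;
    *out0 = (int32_t)rshr_round(acc0, Q_FRAC_BITS);
    *out1 = (int32_t)rshr_round(acc1, Q_FRAC_BITS);
}
```

With a = (1.5, 0.5) and b = (2.0, -1.0) converted through `to_q12`, the two outputs correspond to 2.5 and -2.5 in Q12. Shifting once after the accumulate, rather than after every multiply, keeps the overhead at one shift per output register.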
Upvotes: 3
Views: 2964
Reputation: 28087
See the Vector Floating Point Instruction Set Quick Reference Card for the set of NEON instructions. There are no dedicated fixed-point instructions.
I suggest you read the blog.arm.com post titled Coding for NEON - Part 3: Matrix Multiplication / Fixed Point; it may give you some ideas to try.
It claims:
Using fixed point arithmetic for calculations is often faster than floating point – it requires less memory bandwidth to read and write values that use fewer bits, and multiplication of integer values is generally quicker than the same operations applied to floating point numbers.
However, when using fixed point arithmetic, you must choose the representation carefully to avoid overflow or saturation, whilst preserving the degree of precision in the results that your application requires.
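To make the trade-off in the quote concrete, here is a minimal scalar sketch of a 16-bit fixed-point multiply. It assumes a Q15 format (1 sign bit, 15 fractional bits), which the quoted post does not prescribe, and the helper names `sat16` and `q15_mul_sat` are hypothetical. Values stored this way take half the memory of 32-bit floats, but the result has to be rounded and saturated by hand.

```c
#include <stdint.h>

/* Clamp a 32-bit value into the int16_t range (saturation). */
static int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Q15 x Q15 -> Q15 multiply with rounding and saturation.
   Assumes >> on a negative value is an arithmetic shift. */
static int16_t q15_mul_sat(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product, always fits in 32 bits */
    p = (p + (1 << 14)) >> 15;            /* round back to Q15 */
    return sat16(p);                      /* -1.0 * -1.0 would be +1.0, not representable */
}
```

Two edge cases show what "choose the representation carefully" means here: multiplying -1.0 by -1.0 (both `-32768`) saturates to `32767` instead of wrapping around, and multiplying the smallest positive value `1` by itself rounds to `0`, so precision is lost at the low end.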
Upvotes: 2