Wolfrum

Reputation: 73

NEON Fixed point coding and Fixed vs Floating point operations performance comparison

As the cycle tables in "arm integer NEON operations cycles" and "arm float NEON operations cycles" show, integer multiply operations do not seem to have a definite advantage over floating-point multiply operations. When I converted my floating-point code to fixed point, I had to add a shift instruction after the fixed-point multiplication/division instructions. The extra instructions increased the cycle count, so the fixed-point version is actually slower: 14000 cycles for the floating-point code versus 26000 cycles for the fixed-point code.

Are there any NEON instructions dedicated to fixed-point operations (multiplication and division)? I only found one instruction, which converts between fixed point and floating point. Is there an efficient way to write fixed-point programs in NEON?

I wrote the following sample floating-point code.

    VMUL   Q14.F32,Q8.F32,Q2.F32
    VMUL   Q15.F32,Q8.F32,Q3.F32
    VLD2    {Q10.F32,Q11.F32},[pTw2@256],TwdStep
    VLD2    {Q4.F32,Q5.F32},[pT1@256],fftSize
    VMLA   Q14.F32,Q9.F32,Q3.F32
    VMLS   Q15.F32,Q9.F32,Q2.F32

I then converted it to the following fixed-point code by inserting shift operations after the multiply (VMUL/VMLA/VMLS) instructions.

    VMUL   Q14.S32,Q8.S32,Q2.S32
    VMUL   Q15.S32,Q8.S32,Q3.S32
    VLD2    {Q10.S32,Q11.S32},[pTw2@256],TwdStep
    VLD2    {Q4.S32,Q5.S32},[pT1@256],fftSize
    VMLA   Q14.S32,Q9.S32,Q3.S32
    VMLS   Q15.S32,Q9.S32,Q2.S32

    VRSHR    Q14.S32,Q14.S32,#12     ;Shift instructions to account for fixed point
    VRSHR    Q15.S32,Q15.S32,#12     ;
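To make the arithmetic in the fixed-point version easier to check, here is a minimal scalar C sketch of what one lane of that sequence computes. It assumes a Q12 format (inferred from the `#12` shift) and names its parameters after the registers in the listing; the function names `rshr_round` and `mul_acc_q12` are made up for illustration. Unlike the assembly, which keeps only the low 32 bits of each product, the sketch accumulates in 64 bits, so it only matches the NEON code while the intermediate sums fit in 32 bits.

```c
#include <stdint.h>

#define FRAC_BITS 12  /* assumed Q12 format, taken from the #12 shift */

/* Rounding shift right, modelling one lane of VRSHR:
 * add half of the discarded range, then shift.
 * Assumes >> on a negative value is an arithmetic shift
 * (true for common compilers, implementation-defined in C). */
static int32_t rshr_round(int64_t x, unsigned n)
{
    return (int32_t)((x + ((int64_t)1 << (n - 1))) >> n);
}

/* One lane of the listing:
 *   Q14 = Q8*Q2 + Q9*Q3   (VMUL then VMLA)
 *   Q15 = Q8*Q3 - Q9*Q2   (VMUL then VMLS)
 * followed by a single rounding shift per result. */
static void mul_acc_q12(int32_t q8, int32_t q9, int32_t q2, int32_t q3,
                        int32_t *q14, int32_t *q15)
{
    int64_t acc14 = (int64_t)q8 * q2 + (int64_t)q9 * q3;
    int64_t acc15 = (int64_t)q8 * q3 - (int64_t)q9 * q2;

    *q14 = rshr_round(acc14, FRAC_BITS);
    *q15 = rshr_round(acc15, FRAC_BITS);
}
```

For example, with 1.0 represented as 4096, calling `mul_acc_q12(8192, 4096, 2048, 1024, &a, &b)` corresponds to 2.0*0.5 + 1.0*0.25 and 2.0*0.25 - 1.0*0.5. The shift is applied once per accumulated result rather than once per multiply, which is the same placement the listing uses.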

Upvotes: 3

Views: 2964

Answers (1)

auselen

Reputation: 28087

See the Vector Floating Point Instruction Set Quick Reference Card for the set of NEON instructions. There are no dedicated fixed-point instructions.

I suggest reading the blog.arm.com post titled "Coding for NEON - Part 3: Matrix Multiplication / Fixed Point"; it may give you some ideas to try.

It claims:

Using fixed point arithmetic for calculations is often faster than floating point – it requires less memory bandwidth to read and write values that use fewer bits, and multiplication of integer values is generally quicker than the same operations applied to floating point numbers.

However, when using fixed point arithmetic, you must choose the representation carefully to avoid overflow or saturation, whilst preserving the degree of precision in the results that your application requires.
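The remark about overflow and saturation can be illustrated with a small scalar C sketch of a Q15 multiply that rounds and saturates. This is only a model under stated assumptions: the Q15 format and the function name `q15_mul_sat` are chosen for illustration and do not come from the question or the blog post. As far as I know, the NEON integer instruction VQRDMULH.S16 produces a comparable per-lane result, but check that against the reference card before relying on it.

```c
#include <stdint.h>

/* Multiply two Q15 values (range [-1.0, 1.0)) and return a Q15 result.
 * The 32-bit product is Q30; adding 1 << 14 rounds to nearest before
 * the shift back to Q15. The only input pair that overflows is
 * (-1.0) * (-1.0) = +1.0, which Q15 cannot represent, so it is
 * clamped to the largest positive value instead of wrapping.
 * Assumes >> on a negative value is an arithmetic shift. */
static int16_t q15_mul_sat(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;      /* Q30, fits in 32 bits */
    int32_t rounded = (product + (1 << 14)) >> 15;  /* back to Q15 */

    if (rounded > INT16_MAX) rounded = INT16_MAX;   /* saturate high */
    if (rounded < INT16_MIN) rounded = INT16_MIN;   /* saturate low  */
    return (int16_t)rounded;
}
```

With 0.5 represented as 16384, `q15_mul_sat(16384, 16384)` gives 0.25, and `q15_mul_sat(-32768, -32768)` clamps to 32767 rather than wrapping to -32768. Using 16-bit lanes like this also halves the memory traffic relative to 32-bit data, which is the bandwidth advantage the quoted passage mentions, at the cost of the reduced range and precision it warns about.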

Upvotes: 2
