ruhig brauner

Reputation: 963

DSP performance, what should be avoided?

I am just starting with DSP programming and am writing my first low-level classes and functions. Since I want the functions to be fast (or at least not inefficient), I often wonder what I should use and what I should avoid in functions that get called per sample.

I know that the speed of an instruction varies quite a bit, but I hope some of you can at least share a rule of thumb or just your experience. :)

conditional statements

If I have to use conditions, switch should be faster than an if / else if block, right? Are there differences between using two if-statements or an if-else? Somewhere I read that else should be avoided but I don't know why.

Also, compared to a multiplication, is there a rough estimate of how much more time an if-block takes? In some cases, a multiplication by zero could be used instead of an if-statement:

//something could be an int either 1 or 0:
if(something) {
    signal += something_else;
}
// or:
signal += something * something_else;

functions and function-pointers

Instead of using conditional statements, you could use a function pointer. Rather than evaluating a condition in every call, the pointer could be redirected once to a specific function. However, on every call the pointer has to be dereferenced in order to call the right function, so I don't know if this would help or not.

What I also wonder is whether calling functions has an impact. If so, nesting function calls should be avoided, right?
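To make the function-pointer idea concrete, here is a minimal sketch in C. The names sample_fn, pass_through, half_gain, select_fn and process_block are hypothetical and only serve the illustration. The function is chosen once, outside the per-sample loop, so the loop body contains an indirect call but no condition:

```c
#include <stddef.h>

/* Hypothetical per-sample operations; the names are illustrative only. */
typedef float (*sample_fn)(float);

static float pass_through(float x) { return x; }
static float half_gain(float x)    { return 0.5f * x; }

/* Pick the function once, outside the per-sample loop. */
static sample_fn select_fn(int attenuate)
{
    return attenuate ? half_gain : pass_through;
}

/* The loop itself contains no condition, only an indirect call. */
static void process_block(float *buf, size_t n, sample_fn fn)
{
    for (size_t i = 0; i < n; ++i) {
        buf[i] = fn(buf[i]);
    }
}
```

Whether this beats a plain condition depends on the target: an indirect call usually prevents the compiler from inlining the callee, so measuring both variants is the safer route.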

variables

I would think that defining and using many variables in a function doesn't really have an impact, at least relative to calculations. Is this true? If not, reusing declared variables would be better than declaring more of them.

calculations

Is there a ranking of calculation types in terms of the time they take to execute? I am sure that this depends heavily on the context, but a rule of thumb would be nice. I often read that people only count the multiplications in an algorithm. Is this because additions are relatively fast? Is there a difference between multiplication and division (*0.5 or /2.0)?

I hope that you can share some experience.

Cheers

Upvotes: 4

Views: 936

Answers (2)

shoham

Reputation: 281

Here are answers to some of your questions:

calculations (assuming the native precision of the processor, for example 32 bits):

  • Most DSP microprocessors have single-cycle multipliers, which means a multiplication costs exactly the same as an addition in terms of cycles.
  • Multiplication is generally faster than division.
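As a rough illustration of the last point, a division by a value that stays constant over a block can be replaced by one division up front and one multiplication per sample. This is only a sketch; scale_div and scale_mul are hypothetical names, and buffers of float are an assumption:

```c
#include <stddef.h>

/* Divide every sample by a constant: one division per sample. */
static void scale_div(float *buf, size_t n, float divisor)
{
    for (size_t i = 0; i < n; ++i) {
        buf[i] = buf[i] / divisor;
    }
}

/* Same scaling with one division up front and one multiply per sample. */
static void scale_mul(float *buf, size_t n, float divisor)
{
    const float inv = 1.0f / divisor;   /* computed once per block */
    for (size_t i = 0; i < n; ++i) {
        buf[i] = buf[i] * inv;
    }
}
```

For divisors such as 2.0 the two versions give identical results. For divisors whose reciprocal is not exactly representable, the results can differ in the last bit.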

conditional statements:

if/else: if you look at the generated assembly, you can see that the code for the if branch is usually the one laid out as the default path. So when using if/else, make sure that the condition that occurs more frequently is handled in the if.

In general, though, you should avoid if/else inside a loop where possible, because branches disrupt pipelining.

Good luck.
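One way to keep a condition out of the loop body, sketched here under the assumption that the flag does not change within a block, is to convert it into a gain before the loop. The names mix_branch and mix_gain are hypothetical:

```c
#include <stddef.h>

/* Branch inside the loop: the flag is tested once per sample. */
static void mix_branch(float *signal, const float *other, size_t n, int enabled)
{
    for (size_t i = 0; i < n; ++i) {
        if (enabled) {
            signal[i] += other[i];
        }
    }
}

/* Branch-free loop body: the flag (0 or 1) is turned into a gain once. */
static void mix_gain(float *signal, const float *other, size_t n, int enabled)
{
    const float gain = enabled ? 1.0f : 0.0f;
    for (size_t i = 0; i < n; ++i) {
        signal[i] += gain * other[i];   /* assumes finite samples */
    }
}
```

The second version always performs the multiply and add, so it only pays off if the branch really is the bottleneck on your target. Many compilers also hoist a loop-invariant condition by themselves.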

Upvotes: 2

barak manos

Reputation: 30146

DSP compilers are typically good at optimizing for loops that do not contain function-calls.

Therefore, try to inline every function that you call from within a time-critical for loop.
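As a small sketch of what that can look like in C99, the helper below is declared static inline so the compiler is free to expand it inside the loop. The names hard_clip and clip_block are hypothetical, and inline is only a hint; many toolchains offer a stronger, compiler-specific way to force it:

```c
#include <stddef.h>

/* Small helper marked static inline so the compiler may expand it in place. */
static inline float hard_clip(float x)
{
    if (x > 1.0f)  return 1.0f;
    if (x < -1.0f) return -1.0f;
    return x;
}

/* Time-critical loop: after inlining, no function call remains in the body. */
static void clip_block(float *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        buf[i] = hard_clip(buf[i]);
    }
}
```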

If your DSP is a fixed-point processor, then floating-point operations are implemented in software.

This means that every such operation is essentially replaced by the compiler with a library function.

So you should basically avoid performing floating-point operations inside time-critical for loops.
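A common alternative on fixed-point parts is to keep samples in an integer format such as Q15. The following is a minimal sketch, not taken from any vendor library; q15_mul and q15_gain are hypothetical names, and the right shift of a negative value is assumed to be arithmetic, which is implementation-defined in C but holds for most compilers:

```c
#include <stdint.h>
#include <stddef.h>

/* Q15: value = raw / 32768, representable range [-1, 1). */
static inline int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* Q30 product */
    p += 1 << 14;                          /* round to nearest */
    p >>= 15;                              /* back to Q15 (arithmetic shift assumed) */
    if (p > INT16_MAX) p = INT16_MAX;      /* saturate the -1 * -1 case */
    return (int16_t)p;
}

/* Apply a Q15 gain to a block using integer operations only. */
static void q15_gain(int16_t *buf, size_t n, int16_t gain)
{
    for (size_t i = 0; i < n; ++i) {
        buf[i] = q15_mul(buf[i], gain);
    }
}
```

Here 16384 represents 0.5, so multiplying 16384 by 16384 yields 8192, which represents 0.25.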

The compiler should provide a special #pragma for describing the number of iterations of a for loop:

  • Minimum number of iterations
  • Maximum number of iterations
  • Multiplicity of the number of iterations

Use this #pragma wherever you can, in order to help the compiler perform loop unrolling.
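On Texas Instruments compilers this pragma is called MUST_ITERATE and takes the three values listed above. The sketch below guards it so that the code still builds elsewhere; apply_gain and the chosen bounds (8, 1024, multiple of 4) are assumptions for illustration, and other vendors use different names and syntax, so check your compiler manual:

```c
#include <stddef.h>

/* Hypothetical contract: n is between 8 and 1024 and a multiple of 4. */
static void apply_gain(float *buf, size_t n, float gain)
{
#if defined(__TI_COMPILER_VERSION__)
    /* TI syntax: MUST_ITERATE(min, max, multiple); applies to the next loop. */
    #pragma MUST_ITERATE(8, 1024, 4)
#endif
    for (size_t i = 0; i < n; ++i) {
        buf[i] *= gain;
    }
}
```

The pragma is a promise to the compiler, not a check: calling the function with a count that violates the stated bounds can produce wrong results.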

Finally, DSPs usually support a set of unique operations for enhanced performance.

As an example, consider _dotpu4 on Texas Instruments C64xx, which computes the scalar product of two 32-bit integers src1 and src2, each treated as four packed unsigned 8-bit values: for each pair of 8-bit values in src1 and src2, the 8-bit value from src1 is multiplied by the 8-bit value from src2, and the four products are summed together.
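To show what such an operation replaces, here is a plain C reference of the behaviour described above. It is not TI's implementation; dotpu4_ref is a hypothetical name, and the sketch only mirrors the description (four unsigned 8-bit products summed):

```c
#include <stdint.h>

/* Portable reference: dot product of four packed unsigned 8-bit values. */
static uint32_t dotpu4_ref(uint32_t src1, uint32_t src2)
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t a = (src1 >> (8 * i)) & 0xFFu;   /* i-th byte of src1 */
        uint32_t b = (src2 >> (8 * i)) & 0xFFu;   /* i-th byte of src2 */
        sum += a * b;
    }
    return sum;
}
```

For src1 = 0x01020304 and src2 = 0x05060708 this gives 1*5 + 2*6 + 3*7 + 4*8 = 70. On the DSP the whole loop collapses into a single operation, which is where the gain comes from.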

Check the data-sheet of your DSP, and see if you can make use of any of these operations.

The compiler should generate an intermediate file, which you can explore in order to analyze the expected performance of each of the optimized for loops in your code.

Based on that, you can try different assembly operations that might yield better results.

Upvotes: 1
