Mihai Bişog

Reputation: 1018

Is there any speed difference between calling `n` functions in a loop and doing `n` loops with 1 function call each?

Consider the following two samples:

for (int i = 0; i < someValue; ++i) {
    function1(i);
    function2(i);
}

vs

for (int i = 0; i < someValue; ++i) {
    function1(i);
}

for (int i = 0; i < someValue; ++i) {
    function2(i);
}

Would there be any performance gain from using one form over the other? (I consider comparisons to be extremely cheap to perform.)

Upvotes: 2

Views: 65

Answers (2)

ChileAddict - Intel

Reputation: 642

With two loops you double the loop overhead: the CPU increments and tests i on every iteration of each loop, so twice as often as with the single loop. The second loop also adds branches, and each mispredicted branch (typically the one that exits a loop) causes your pipeline to be flushed.

If your compiler does loop unrolling then I would think you wouldn't see much of a difference. If your system has hyperthreading technology and the functions use different resources (memory, int, float vs. all int or all mem or all float), having them in the same loop would be faster.

The amount of performance difference is going to depend on what compiler options you are using, what resources your functions are requiring, depth of pipeline, and basic architecture of your system.

If you do not use any performance tuning tricks, the single loop should run faster. However, unless the loop's upper bound is huge, the difference in performance may be negligible, even if the single loop is technically "faster". But the answer is really, "It depends" on a lot of things.
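Since the answer depends on so many things, measuring is the only reliable way to settle it for a given program. Below is a minimal timing sketch, not a rigorous benchmark. The bodies of `function1` and `function2` are hypothetical stand-ins (the question does not show them), and `runSingleLoop`, `runTwoLoops`, and `millisecondsFor` are names made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for function1 and function2 from the question.
// Each writes into its own buffer so the calls have an observable effect.
static std::vector<std::uint32_t> out1, out2;

void function1(int i) {
    out1[static_cast<std::size_t>(i)] = static_cast<std::uint32_t>(i) * 3u + 1u;
}

void function2(int i) {
    out2[static_cast<std::size_t>(i)] = static_cast<std::uint32_t>(i) ^ 0x5Au;
}

// Resize and zero both buffers so each run starts from the same state.
static void resetOutputs(int n) {
    assert(n >= 0);
    out1.assign(static_cast<std::size_t>(n), 0u);
    out2.assign(static_cast<std::size_t>(n), 0u);
}

// Sum of everything written; lets us confirm both forms did the same work.
static std::uint64_t checksum() {
    std::uint64_t sum = 0;
    for (std::uint32_t v : out1) sum += v;
    for (std::uint32_t v : out2) sum += v;
    return sum;
}

// First form from the question: one loop, both calls per iteration.
std::uint64_t runSingleLoop(int someValue) {
    resetOutputs(someValue);
    for (int i = 0; i < someValue; ++i) {
        function1(i);
        function2(i);
    }
    return checksum();
}

// Second form from the question: two loops, one call per iteration.
std::uint64_t runTwoLoops(int someValue) {
    resetOutputs(someValue);
    for (int i = 0; i < someValue; ++i) {
        function1(i);
    }
    for (int i = 0; i < someValue; ++i) {
        function2(i);
    }
    return checksum();
}

// Wall-clock time of a single call to f, in milliseconds.
template <typename F>
double millisecondsFor(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

To use it, call something like `millisecondsFor([&] { runSingleLoop(n); })` and `millisecondsFor([&] { runTwoLoops(n); })` several times each with a large `n`, and compare the spread of the timings rather than a single run. Build with the same optimization flags as the real program, because the optimizer may inline, unroll, or even merge the loops.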

Upvotes: 2

Ingo

Reputation: 36329

As always, it depends. This time, it depends on the functions themselves.

There is a scenario in which the 2-loop variant can be faster: when function2 causes all the data that function1 needs to be evicted from the CPU cache, and vice versa. In that case, with the single loop, each function has to reload its data into the cache every time it is called.

OTOH, with the 2 loops, the data for function1 remains in the cache during the first loop, and the data for function2 remains in the cache during the second loop. This will speed things up noticeably. The downside is the cost of the extra jumps, index increments and bound checks, but this is unlikely to be noticeable.
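A rough sketch of what such a pair of functions could look like, under loudly stated assumptions: each function reads its own lookup table on every call, and the 24 KiB table size is an arbitrary value chosen so that one table might fit in a small data cache while both together might not. The tables, `sink`, `fused`, and `split` are all hypothetical names for illustration; whether the effect shows up at all depends on the actual cache sizes of the machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 24 KiB per table, picked only for illustration.
constexpr std::size_t kTableBytes = 24 * 1024;
// Assumption: 64-byte cache lines; we read one byte per line.
constexpr std::size_t kLineBytes = 64;

std::vector<std::uint8_t> table1(kTableBytes, 1);
std::vector<std::uint8_t> table2(kTableBytes, 2);
std::uint64_t sink = 0;  // accumulates results so the reads are not dead code

// Reads one byte from every cache line of the table and returns the sum.
static std::uint64_t sweep(const std::vector<std::uint8_t>& table) {
    std::uint64_t sum = 0;
    for (std::size_t j = 0; j < table.size(); j += kLineBytes) {
        sum += table[j];
    }
    return sum;
}

// Hypothetical function1: its working set is table1.
void function1(int i) { sink += sweep(table1) + static_cast<std::uint64_t>(i); }

// Hypothetical function2: its working set is table2.
void function2(int i) { sink += sweep(table2) + static_cast<std::uint64_t>(i); }

// Single loop: the two working sets alternate on every iteration.
std::uint64_t fused(int someValue) {
    assert(someValue >= 0);
    sink = 0;
    for (int i = 0; i < someValue; ++i) {
        function1(i);
        function2(i);
    }
    return sink;
}

// Two loops: only one working set is in use at a time.
std::uint64_t split(int someValue) {
    assert(someValue >= 0);
    sink = 0;
    for (int i = 0; i < someValue; ++i) {
        function1(i);
    }
    for (int i = 0; i < someValue; ++i) {
        function2(i);
    }
    return sink;
}
```

Both variants compute the same value, so any difference in running time comes from the order of the memory accesses. Timing `fused` against `split` while varying `kTableBytes` is one way to see whether a given machine hits the situation described above.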

Upvotes: 4
