wefwefa3
wefwefa3

Reputation: 4036

How does "goto" statements affect the "branch prediction" of the CPU?

To learn more about the CPU and code optimization I have started to study Assembly programming. I have also read about clever optimizations like "branch prediction" that the CPU does to speed itself up.

My question might seem foolish since I do not know the subject very well yet.

I have a very vague memory that I have read somewhere (on the internet) that goto statements will decrease the performance of a program because it does not work well with the branch prediction in the CPU. This might however just be something that I made up and did not actually read.

I think that it could be true.

I hope this example (in pseudo-C) will clarify why I think that is so:

int function(...) {
    VARIABLES DECLARED HERE

    if (HERE IS A TEST) {
        CODE HERE ...
    } else if (ANOTHER TEST) {
        CODE HERE ...
    } else {
        /*
        Let us assume that the CPU was smart and predicted this path.
        What about the jump to `label`?

        Is it possible for the CPU to "pre-fetch" the instructions over there?
        */
        goto label;
    }

    CODE HERE...

label:
    CODE HERE...
}

To me it seems like a very complex task. That is because then the CPU will need to look up the place where the goto jumps to inorder to be able to pre-fetch the instructions over there.

Do you know anything about this?

Upvotes: 4

Views: 1956

Answers (2)

user3629249
user3629249

Reputation: 16540

due to 'pipelining' and similar activities, the branch instruction could actually be placed several instructions before the location where the actual branch is to occur. (this is part of the branch prediction logic found in the compiler).

a goto statement is just a jump instruction.

As a side note: Given structured programming concepts, code clarity, readability, maintainability considerations, etc; the 'goto' statement should never be used.

on most CPUs, any jump/call/return type of instruction will flush the prefetch cache then reload that cache from the new location, IF the new location is not already in the cache.

Note: for small loops, which will always will contain 'at least' one jump instruction, many CPUs have an internal buffer that the programmer can exploit to make small loops only perform one prefetch sequence and therefore execute many orders of magnitude faster.

Upvotes: 1

Sneftel
Sneftel

Reputation: 41503

Unconditional branches are not a problem for the branch predictor, because the branch predictor doesn't have to predict them.

They add a bit of complexity to the speculative instruction fetch unit, because the existence of branches (and other instructions which change the instruction pointer) means that instructions are not always fetched in linear order. Of course, this applies to conditional branches too.

Remember, branch prediction and speculative execution are different things. You don't need branch prediction for speculative execution: you can just speculatively execute code assuming that branches are never taken, and if you ever do take a branch, cancel out all the operations from beyond that branch. That would be a particularly stupid thing to do in the case of unconditional branches, but it would keep the logic nice and simple. (IIRC, this was how the first pipelined processors worked.)

(I guess you could have branch prediction without speculative execution, but there wouldn't really be a point to it, since the branch predictor wouldn't have anybody to tell its predictions to.)

So yes, branches -- both conditional and unconditional -- increase the complexity of instruction fetch units. That's okay. CPU architects are some pretty smart people.

EDIT: Back in the bad old days, it was observed that the use of goto statements could adversely affect the ability of the compilers of the day to optimize code. This might be what you were thinking of. Modern compilers are much smarter, and in general are not taken too much aback by goto.

Upvotes: 8

Related Questions