Reputation: 3454
I am new to LLVM and Intermediate Representation (IR), and I am trying to understand how PHI nodes are handled in LLVM IR. I understand that PHI nodes are a fundamental component of SSA (Static Single Assignment) form in LLVM IR, and are used to represent control flow in a program.
However, I am not sure if PHI nodes remain in LLVM IR until compilation to binary. Are all optimizations in LLVM's optimization pipeline designed to work with PHI nodes and SSA form, or are there cases where PHI nodes need to be eliminated or modified before optimization can take place?
I would appreciate any insights or clarifications on this topic. Thank you!
Upvotes: 0
Views: 312
Reputation: 21878
The LLVM compilation pipeline consists of dozens of separate transformations (called passes), which can roughly be split into several main phases:
As you can see, the last three phases do not use SSA (they use copies instead of PHI instructions).
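To make "copies instead of PHI instructions" concrete, here is a naive sketch of how a PHI can be lowered: each incoming value becomes a copy placed at the end of the corresponding predecessor block, just before its terminator. This is not LLVM's actual implementation. The string-based pseudo-IR, the `copy` mnemonic, and the `lower_phis` helper are all assumptions made for illustration, and the sketch ignores complications such as critical edges and PHIs that depend on each other.

```python
from typing import Dict, List, Tuple

# Hypothetical toy IR: a block is a list of instruction strings whose last
# entry is the terminator. A PHI is modelled as (dest, {pred_block: value}).
Blocks = Dict[str, List[str]]
Phis = Dict[str, List[Tuple[str, Dict[str, str]]]]


def lower_phis(blocks: Blocks, phis: Phis) -> Blocks:
    """Replace each PHI by one copy per predecessor (naive lowering)."""
    lowered = {name: list(body) for name, body in blocks.items()}
    for _block, block_phis in phis.items():
        for dest, incoming in block_phis:
            for pred, value in incoming.items():
                body = lowered[pred]
                # Insert the copy just before the predecessor's terminator.
                body.insert(len(body) - 1, f"{dest} = copy {value}")
    return lowered


blocks = {
    "then": ["%a = add 1, 2", "br merge"],
    "else": ["%b = mul 3, 4", "br merge"],
    "merge": ["ret %x"],
}
# In SSA form, "merge" would start with: %x = phi [%a, then], [%b, else]
phis = {"merge": [("%x", {"then": "%a", "else": "%b"})]}

result = lower_phis(blocks, phis)
```

After lowering, `%x` is assigned in two different places, so the code is no longer in SSA form. That is the trade the later phases make.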
Upvotes: 3
Reputation: 9675
Write this on your blackboard: Nothing in a compiler is simple. If you have no blackboard, write it on your whiteboard, forearm or the door of a convenient bathroom stall.
SSA is extremely convenient for algorithms that reason about code, which includes practically all optimisations and all analysis. I would say that SSA is as close to "always" as anything in a compiler ever is.
But of course some optimisations exist that run at a very late stage during compilation, because even though SSA is generally extremely convenient for reasoning, that doesn't make it necessarily the most convenient form for every instance of reasoning. It's close, but…
Suppose a fictional LLVM backend produces the three assembly instructions 'add r1, r2, r3', 'mv r3, r4' and 'add r1, r5, r3', where the destination register is the last one. The third instruction overwrites r3 without reading it, so the value the first instruction wrote to r3 is needed only for the copy into r4. You may then observe that if the first instruction were changed to 'add r1, r2, r4', the second one could be removed. This is called peephole optimisation, and some LLVM backends do contain peephole optimisers that work after register allocation. (I'm fairly sure I've seen either the ARM or x86 backend perform peephole optimisation twice, both before and after register allocation. Compilers are never simple.)
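The rewrite on those three instructions can be sketched as a tiny peephole pass. This is only an illustration for the fictional instruction set, not code from any LLVM backend. The tuple encoding and the `fold_copy` name are assumptions, and the sketch assumes straight-line code with no branch targets inside the three-instruction window.

```python
from typing import List, Tuple

# (opcode, operands); the destination register is the LAST operand,
# matching the fictional assembly in the example.
Instr = Tuple[str, Tuple[str, ...]]


def fold_copy(code: List[Instr]) -> List[Instr]:
    """Fold `op ..., t ; mv t, d ; <overwrites t without reading it>`
    into `op ..., d ; <overwrites t>`."""
    out = list(code)
    i = 0
    while i + 2 < len(out):
        (op1, a1), (op2, a2), (_op3, a3) = out[i], out[i + 1], out[i + 2]
        tmp = a1[-1]  # register written by the first instruction
        is_copy_of_tmp = op2 == "mv" and len(a2) == 2 and a2[0] == tmp
        # tmp is dead after the copy if the third instruction
        # overwrites it and does not read it.
        tmp_is_dead = a3[-1] == tmp and tmp not in a3[:-1]
        if is_copy_of_tmp and a2[1] != tmp and tmp_is_dead:
            out[i] = (op1, a1[:-1] + (a2[1],))  # write straight into d
            del out[i + 1]                      # the mv is now redundant
            continue                            # re-examine the new window
        i += 1
    return out


example = [
    ("add", ("r1", "r2", "r3")),
    ("mv", ("r3", "r4")),
    ("add", ("r1", "r5", "r3")),
]
print(fold_copy(example))
# [('add', ('r1', 'r2', 'r4')), ('add', ('r1', 'r5', 'r3'))]
```

Note that the check depends on r3 being reused for a different value, which is exactly the kind of reasoning that only exists once the code has left SSA form and registers have been assigned.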
So, even though it's not 100% true, you can very nearly say that phi nodes remain in the code until the final native machine code is generated. That's because if someone wants to add any clever analysis, transformation or optimisation, it's nearly guaranteed that they will insert the new code before phi nodes are removed and registers are allocated.
Upvotes: 2