Fish
Fish

Reputation: 1265

Is generation of three address code compulsory in a compiler?

Is generation of three address code compulsory in a compiler? My compiler already converts expressions to AST.

Upvotes: 6

Views: 964

Answers (1)

Ira Baxter
Ira Baxter

Reputation: 95362

No, you don't need 3 address code. (Its pretty useful).

You can generate code "on-the-fly" directly as you parse, even without building an AST if you try hard enough. It is very hard to generate really good code this way, but with a good peephole optimizer, one can get surprisingly good results. (See Jack Davidson's work on code generation by peephole optimization: http://dl.acm.org/citation.cfm?id=357098)

You can build code generators that work directly from the AST. You can generate better code than on-the-fly, because the code generator can see forward and backwards in the tree, and annotate the tree with hints about what one part of the code generator expects of another.

One can also go pure transformational, and apply tree-rewriting to transform the AST for the original program, incrementally into an AST for the object code. Since tree-rewrites are essentially generalizations of string rewrites, they are Turing capable an in theory you can generate arbitrarily good code this way. I think LISP compilers often do this.

Three-address code is a way commonly used by compilers that are going to do code optimizations. What it really represents is essentially dataflow between the abstract computations the user program implies need to be done. This dataflow tells the compiler which computations depend on each other, and which ones are independent, so it can optimize on things it recognizes, and be assured that the optimizations on one part don't break another. One can pick other data flow representations or variants and do just as well.

It is fundamental to doing really good optimizations on procedural programs with side effects that one be able to identify the dataflows. Three-address code identifies them directly. The other techniques I mentioned, typically don't represent them directly, but have to compute them implicitly if they want to use that information, and that's actually pretty awkward. So, three address code is commonly used.

Upvotes: 6

Related Questions