Building CFGs from the AST

Question

As far as I know after the parser builds the AST this structure is transformed into a 'intermediate level' IR, for example three adress code or any other. Then, in order to perfom some analysis this IR is transformed into a control flow graph. My question is, if it is possible going from the AST representation to the CFGs without going through another IR, and then perform, for example, data flow analysis through the CFGs with successfull results?

Ira Baxter · Accepted Answer

You can't construct a CFG without first doing scope and then name resolution.

You need scope resolution to determine the "scope" of implicit control transfers, e.g., the boundaries of an if-then-else, a try-statement, blocks with "break statements", return statements, etc. This is generally fairly easy because scopes tend to follow the syntax structure of the language, which you already have with an AST.

You also need scope resolution to determine where identifiers are defined, if your language allows control transfers to named entities ("goto "), "call ". You can't know which a goto targets without knowing which scope contains the goto, and how scope control the lookup of the named label.

With scope resolution you can implement name resolution; this allows you to assign names to the defining scope containing them, and to attach the definition of the name (e.g., knowing that "goto x" refers to the x in specific scope, and thus to line 750 where x is defined in that scope.

Once you have name resolution (so you can look up the definition of x in "goto x"), now you can construct a control flow graph.

You can do all three of these using attribute grammars, which are essentially computations directly over the AST. So, no, you don't need anything other than the AST to implement these. {You can learn more about attribute computations at my SO answer describing them. Of course, anything you can do with a formal attribute computation you can also do by simply writing lots of recursive procedures that walk over the tree and compute the equivalent result; that's what attribute grammars are compiled to in practice.

Here you can find some details on how to extract the control flow graph by attribute computation

One messy glitch are computations like "goto @" (GCC allows this). To do this right, you need to compute data flow for label variables (which technically require you to construct a CFG first :-( ). You can avoid doing the data flow by constructing a conservative answer: "goto @" can go to any label whose address is taken in the function which the goto is found. You can compute this with an attribute grammar, too.

For languages which have only structured control flow, you can actually implement data flow analysis directly by attribute grammar.

Building CFGs from the AST

Answers (1)

Related Questions