Jason S
Jason S

Reputation: 189626

How do you include subroutine calls in a control flow graph?

I get the idea of a control flow graph; it involves nodes that are basic blocks (sequences of operations that always occur), connected by edges that represent jumps.

But how do you represent a subroutine call?

If I have two functions like this:

int tweedledee(void)
{
    int x = 16;
    return x + do_something();
}

int tweedledum(int n)
{
    if (n < 0)
       return n;
    else
       return n + do_something();
}

with both functions calling do_something(), then I need a way to allow a jump from a block in tweedledee to do_something and then another jump back to tweedledee, and a jump from a block in tweedledum to do_something and then back to tweedledum, but there's never a jump from tweedledee to do_something and then to tweedledum. (Or tweedledumdo_somethingtweedledee) So it seems like a plain directed graph wouldn't suffice to define these relationships... maybe I'm missing something.

Upvotes: 2

Views: 1647

Answers (1)

Pachelbel
Pachelbel

Reputation: 533

Procedures make CFGs and static analysis in general quite complicated. There are different approaches for representing routine calls in control flow graphs.

One first and common solution is to create a CFG for each routine, and to split the "call nodes" (the basic block corresponding to "CALL do_something()" in tweedledee for example) into two nodes, the actual call block C and a block R for writing the return value.

An edge (normally of special type) is inserted between C and the initial block of the routine being called, and another one between the end block of the routine being called and R. A simple example:

void foo() { int x = bar(); }
int bar() { return 1; }

might be transformed to:

[init::foo] ==> [CALL bar()]  [x = RETVAL(bar())] ==> [end::foo]
                     ||            /\
                     \/            ||
                [init::bar] ==> [ret = 1 (end::bar)]

If there is another call to bar(), e.g. from a routine

void foo2() { int y = bar(); }

then the following graph might be the result:

 [init::foo] ==> [CALL bar()]  [x = RETVAL(bar())] ==> [end::foo]
                     ||            /\
                     \/            ||
                [init::bar] ==> [ret = 1 (end::bar)]
                     /\            ||
                     ||            \/
 [init::foo2]==> [CALL bar()]  [x = RETVAL(bar())] ==> [end::foo2]

The problem here: There are now paths in this graph (e.g. init::foo ==> CALL bar() ==> init::bar ==> ret = 1 ==> x = RETVAL(bar()) ==> end::foo2) which make no sense in the program. This is what you might mean by wondering whether a "plain directed graph" suffices. There are different approaches for this problem, e.g.: Make copies of a routine (here bar) for each invocation of it. This only helps if there is no real recursion, and is expensive in general. For static analysis it is often useful to "overapproximate" the number of paths by only using a fixed number of such copies.

The slides Interprocedural Analysis Lecture Notes (Stephen Chong) seem to be a good introduction. There are also quite good books about constructing such graphs, e.g. Principles of Program Analysis by Nielson et al.

Upvotes: 3

Related Questions