Matt
Matt

Reputation: 22133

How can I reduce the stack frame of a deeply recursive function in C?

Suppose I have some recursive function that manipulates a graph structure:

typedef struct Node {
    Data data;
    size_t visited;
    size_t num_neighbors;
    struct Node *neighbors[];
} Node;

void doSomethingWithNode(Node *node)
{
    if ((!node) || node->visited)
        return;
    node->visited = 1;
    /* Do something with node->data */
    size_t /* some local variables */;
    size_t i;
    for (i = 0; i < node->num_neighbors; i++)
    {
        /* Do some calculations with the local variables */
        if (/* something */)
            doSomethingWithNode(node->neighbors[i]);
    }
}

Because of the local variables I use in the loop, the compiler (gcc) creates a larger-than-I-would-like stack frame for this function (a good number of pushq and popq instructions even with -O3), which is a problem, since it is deeply recursive. Since it doesn't matter what order I visit the nodes in, I could refactor this code to use a stack of Node pointers, thus reducing the overhead to one pointer per iteration.

  1. Are there any hints I can give the compiler (gcc) to fix this problem?
  2. If not, is it possible to make use of the call stack itself for my stack of pointers without resorting to assembly?

Upvotes: 9

Views: 1677

Answers (8)

4566976
4566976

Reputation: 2499

Put all local variables which are not essential for recursion into struct locals and access them with plocals->. The advantage over putting the calculations into their own, non-recursive function (Arkadiy's answer) is, if needed, that the variables are valid and retain their values over the recursions.

#include <stddef.h>

struct Data {
    char data[1];
};

typedef struct Node {
    struct Data data;
    size_t visited;
    size_t num_neighbors;
    struct Node *neighbors;
} Node;

struct Locals {
    /* local variables not essential for recursion */;
};
static void doSomethingWithNodeRecurse(Node *node, struct Locals *plocals)
{
    if ((!node) || node->visited)
        return;
    node->visited = 1;
    /* Do something with node->data */
    /* local variables essential for recursion */
    size_t i;
    for (i = 0; i < node->num_neighbors; i++)
    {
        /* Do some calculations with the local variables */
        if (1/* something */)
            doSomethingWithNodeRecurse(&node->neighbors[i], plocals);
        /* Do some calculations with the local variables */
    }
}

void doSomethingWithNode(Node *node)
{
    struct Locals locals;

    doSomethingWithNodeRecurse(node, &locals);
}

If the variables are still too huge to allocate them on the stack, they can be allocated on the heap as Vagish suggested:

#include <stddef.h>
#include <stdlib.h>

struct Data {
    char data[1];
};

typedef struct Node {
    struct Data data;
    size_t visited;
    size_t num_neighbors;
    struct Node *neighbors;
} Node;

struct Locals {
    /* local variables too big for allocation on stack */;
};
void doSomethingWithNode(Node *node)
{
    struct Locals *plocals;

    if ((!node) || node->visited)
        return;

    /* ---> allocate the variables on the heap <--- */
    if ((plocals = malloc(sizeof *plocals)) == NULL) abort();

    node->visited = 1;
    /* Do something with node->data */
    size_t i;
    for (i = 0; i < node->num_neighbors; i++)
    {
        /* Do some calculations with the local variables */
        if (1/* something */)
            doSomethingWithNode(&node->neighbors[i]);
        /* Do some calculations with the local variables */
    }
    /* ---> free the variables <--- */
    free(plocals);
}

Upvotes: 1

Paul Ogilvie
Paul Ogilvie

Reputation: 25286

I find many if the other answers not elegant and requiring much overhead. Probably there is no good way and any way depends on the type of recursion at hand.

In your case, the recursion is at the end and only variable i is still needed. To reduce the stack frame, you can use for al other variables global space.

If you want to reduce even more and remove i, you can use node->visisted as a counter:

static struct VARS {
    int iSomething;
    Data *dataptr;
    double avg;
} gVars;

void doSomethingWithNode(Node *node)
{
    if ((!node) || node->visited)
        return;
    /* Do something with node->data */
    /* some local variables in global space */;
    gVars.iSomething= 1;
    for (; node->visited < node->num_neighbors; node->visited++)
    {
        /* Do some calculations with the local variables */
        if (/* something */)
            doSomethingWithNode(node->neighbors[node->visited]);
    }
}

Upvotes: 1

chux
chux

Reputation: 154087

"it is deeply recursive" is a hint that the deepest recursions occur in paths that do not have more than 1 non-visited neighbor.

Let code only recurse when there is more than 1 interesting neighbor, otherwise just loop.

void doSomethingWithNode(Node *node) {
  while (node) {
    if (node->visited) return;
    node->visited = 1;
    /* Do something with node->data */
    size_t /* some local variables */;
    size_t i;
    Node *first = NULL;
    for (i = 0; i < node->num_neighbors; i++) {
        /* Do some calculations with the local variables */
        if (/* something */) {

          // Save the first interesting node->neighbors[i] for later
          if (first == NULL && 
              node->neighbors[i] != NULL && 
              node->neighbors[i]->visited == 0) {
            first = node->neighbors[i];

         } else {
            doSomethingWithNode(node->neighbors[i]);
          }
        }
    }
    node = first;
  }
}

This does not reduce the stack frame, but eliminate recursion when there is only 1-ply. IOWs: when recursion is not needed.

The recursion depth should now no longer exceed O(log2(n)) instead of the original worst-case O(n)

Upvotes: 2

user3458
user3458

Reputation:

Can you put the calculations into their own, non-recursive function? That way the stack for all the temporary variables will not be there when you make the recursive call.

Update: looks like at least some data in the local variables is essential for recursion. You can use alloca to explicitly allocate memory on the stack.

Upvotes: 3

Vagish
Vagish

Reputation: 2547

If you have more number of local variables and arrays then you can try allocating memory using malloc,manipulate it using single pointer and fixed offsets .free the memory while exiting from function.

In this way you will save the stack and reuse same heap(maybe) section for all the iterations.

Upvotes: 1

You could maintain a vector or a list (or some queue, or perhaps a stack, or even some arbitrary unordered set) of nodes to be visited (and you probably want to maintain a set or hash table of already visited nodes).

Then you'll have a loop which picks the node in front of the to-be-visited container, and might add some unvisited nodes in the back of that container....

Read the wikipages about continuation passing style and about tail calls

Google also for Deutsch Schorr Waite Algorithm, it could give you some ideas.

Upvotes: 4

Joey Adams
Joey Adams

Reputation: 43390

You can use malloc and realloc to manage a dynamically-growing stack of nodes. Here's a "class" for managing the stack:

typedef struct Stack {
    void **pointers;
    size_t count;
    size_t alloc;
} Stack;

void Stack_new(Stack *stack)
{
    stack->alloc = 10;
    stack->count = 0;
    stack->pointers = malloc(stack->alloc * sizeof(void*));
}

void Stack_free(Stack *stack)
{
    free(stack->pointers);
    stack->pointers = null;
}

void Stack_push(Stack *stack, void *value)
{
    if (stack->alloc < stack->count + 1) {
        stack->alloc *= 2;
        stack->pointers = realloc(stack->pointers, stack->alloc * sizeof(void*));
    }
    stack->pointers[stack->count++] = value;
}

void *Stack_pop(Stack *stack)
{
    if (stack->count > 0)
        return stack->pointers[--stack->count];
    return NULL;
}

Upvotes: 2

unwind
unwind

Reputation: 400009

What would you expect the compiler to do in order to solve the problem?

You can of course go through your code, and minimize the number of local variables, try to make it as clear as possible that they are (for instance) only assigned to once by using const when possible, and so on. This might make the compiler re-use the space, if possible.

Failing that, you can probably save some memory by going iterative instead since that cuts out the need for the return address.

Upvotes: 2

Related Questions