Reputation: 22133
Suppose I have some recursive function that manipulates a graph structure:
typedef struct Node {
Data data;
size_t visited;
size_t num_neighbors;
struct Node *neighbors[];
} Node;
void doSomethingWithNode(Node *node)
{
if ((!node) || node->visited)
return;
node->visited = 1;
/* Do something with node->data */
size_t /* some local variables */;
size_t i;
for (i = 0; i < node->num_neighbors; i++)
{
/* Do some calculations with the local variables */
if (/* something */)
doSomethingWithNode(node->neighbors[i]);
}
}
Because of the local variables I use in the loop, the compiler (gcc
) creates a larger-than-I-would-like stack frame for this function (a good number of pushq
and popq
instructions even with -O3
), which is a problem, since it is deeply recursive. Since it doesn't matter what order I visit the nodes in, I could refactor this code to use a stack of Node
pointers, thus reducing the overhead to one pointer per iteration.
gcc
) to fix this problem?Upvotes: 9
Views: 1677
Reputation: 2499
Put all local variables which are not essential for recursion into struct locals
and access them with plocals->
. The advantage over putting the calculations into their own, non-recursive function (Arkadiy's answer) is, if needed, that the variables are valid and retain their values over the recursions.
#include <stddef.h>
struct Data {
char data[1];
};
typedef struct Node {
struct Data data;
size_t visited;
size_t num_neighbors;
struct Node *neighbors;
} Node;
struct Locals {
/* local variables not essential for recursion */;
};
static void doSomethingWithNodeRecurse(Node *node, struct Locals *plocals)
{
if ((!node) || node->visited)
return;
node->visited = 1;
/* Do something with node->data */
/* local variables essential for recursion */
size_t i;
for (i = 0; i < node->num_neighbors; i++)
{
/* Do some calculations with the local variables */
if (1/* something */)
doSomethingWithNodeRecurse(&node->neighbors[i], plocals);
/* Do some calculations with the local variables */
}
}
void doSomethingWithNode(Node *node)
{
struct Locals locals;
doSomethingWithNodeRecurse(node, &locals);
}
If the variables are still too huge to allocate them on the stack, they can be allocated on the heap as Vagish suggested:
#include <stddef.h>
#include <stdlib.h>
struct Data {
char data[1];
};
typedef struct Node {
struct Data data;
size_t visited;
size_t num_neighbors;
struct Node *neighbors;
} Node;
struct Locals {
/* local variables too big for allocation on stack */;
};
void doSomethingWithNode(Node *node)
{
struct Locals *plocals;
if ((!node) || node->visited)
return;
/* ---> allocate the variables on the heap <--- */
if ((plocals = malloc(sizeof *plocals)) == NULL) abort();
node->visited = 1;
/* Do something with node->data */
size_t i;
for (i = 0; i < node->num_neighbors; i++)
{
/* Do some calculations with the local variables */
if (1/* something */)
doSomethingWithNode(&node->neighbors[i]);
/* Do some calculations with the local variables */
}
/* ---> free the variables <--- */
free(plocals);
}
Upvotes: 1
Reputation: 25286
I find many if the other answers not elegant and requiring much overhead. Probably there is no good way and any way depends on the type of recursion at hand.
In your case, the recursion is at the end and only variable i is still needed. To reduce the stack frame, you can use for al other variables global space.
If you want to reduce even more and remove i, you can use node->visisted as a counter:
static struct VARS {
int iSomething;
Data *dataptr;
double avg;
} gVars;
void doSomethingWithNode(Node *node)
{
if ((!node) || node->visited)
return;
/* Do something with node->data */
/* some local variables in global space */;
gVars.iSomething= 1;
for (; node->visited < node->num_neighbors; node->visited++)
{
/* Do some calculations with the local variables */
if (/* something */)
doSomethingWithNode(node->neighbors[node->visited]);
}
}
Upvotes: 1
Reputation: 154087
"it is deeply recursive" is a hint that the deepest recursions occur in paths that do not have more than 1 non-visited neighbor
.
Let code only recurse when there is more than 1 interesting neighbor, otherwise just loop.
void doSomethingWithNode(Node *node) {
while (node) {
if (node->visited) return;
node->visited = 1;
/* Do something with node->data */
size_t /* some local variables */;
size_t i;
Node *first = NULL;
for (i = 0; i < node->num_neighbors; i++) {
/* Do some calculations with the local variables */
if (/* something */) {
// Save the first interesting node->neighbors[i] for later
if (first == NULL &&
node->neighbors[i] != NULL &&
node->neighbors[i]->visited == 0) {
first = node->neighbors[i];
} else {
doSomethingWithNode(node->neighbors[i]);
}
}
}
node = first;
}
}
This does not reduce the stack frame, but eliminate recursion when there is only 1-ply. IOWs: when recursion is not needed.
The recursion depth should now no longer exceed O(log2(n)) instead of the original worst-case O(n)
Upvotes: 2
Reputation:
Can you put the calculations into their own, non-recursive function? That way the stack for all the temporary variables will not be there when you make the recursive call.
Update: looks like at least some data in the local variables is essential for recursion. You can use alloca to explicitly allocate memory on the stack.
Upvotes: 3
Reputation: 2547
If you have more number of local variables and arrays then you can try allocating memory using malloc
,manipulate it using single pointer and fixed offsets .free
the memory while exiting from function.
In this way you will save the stack and reuse same heap(maybe) section for all the iterations.
Upvotes: 1
Reputation: 1
You could maintain a vector or a list (or some queue, or perhaps a stack, or even some arbitrary unordered set) of nodes to be visited (and you probably want to maintain a set or hash table of already visited nodes).
Then you'll have a loop which picks the node in front of the to-be-visited container, and might add some unvisited nodes in the back of that container....
Read the wikipages about continuation passing style and about tail calls
Google also for Deutsch Schorr Waite Algorithm, it could give you some ideas.
Upvotes: 4
Reputation: 43390
You can use malloc
and realloc
to manage a dynamically-growing stack of nodes. Here's a "class" for managing the stack:
typedef struct Stack {
void **pointers;
size_t count;
size_t alloc;
} Stack;
void Stack_new(Stack *stack)
{
stack->alloc = 10;
stack->count = 0;
stack->pointers = malloc(stack->alloc * sizeof(void*));
}
void Stack_free(Stack *stack)
{
free(stack->pointers);
stack->pointers = null;
}
void Stack_push(Stack *stack, void *value)
{
if (stack->alloc < stack->count + 1) {
stack->alloc *= 2;
stack->pointers = realloc(stack->pointers, stack->alloc * sizeof(void*));
}
stack->pointers[stack->count++] = value;
}
void *Stack_pop(Stack *stack)
{
if (stack->count > 0)
return stack->pointers[--stack->count];
return NULL;
}
Upvotes: 2
Reputation: 400009
What would you expect the compiler to do in order to solve the problem?
You can of course go through your code, and minimize the number of local variables, try to make it as clear as possible that they are (for instance) only assigned to once by using const
when possible, and so on. This might make the compiler re-use the space, if possible.
Failing that, you can probably save some memory by going iterative instead since that cuts out the need for the return address.
Upvotes: 2