Alvein
Alvein

Reputation: 207

Converting function from recursive to iterative

-There's a <TL;DR> section down below-

Hello,

Let's say I need an algorithm to process every node in a tree, having into account that:

  1. The tree is HUGE
  2. Processing each node is TIME CONSUMING.

This is the basic** traverse recursive function:

function process_branch(node)
    process_node(node)
    for each c in node.children
        process_branch(c)
    end for
end function

Now, this is the iterative version, using my own stack:

function process_branch(node)
    stack.push(node)
    while not stack.empty
        n=stack.pop()
        process_node(n)
        for each c in n.children
            stack.push(c)
        end for
    end while
end function

**All the algorithms I've placed here are extremely simplified versions of the actual ones.

So far, good.

More to consider:

But then, I need a way of calling the function again, resuming the processing and ignoring the already complete branches.

-<TL;DR> section starts here-

This recursive version works marvels:

function process_branch(node)
    if not node.completed
        process_node(node)
        for each c in node.children
            process_branch(c)
        end for
        node.completed=true
    end if
end function

Notice how a branch's root node is only marked as completed when all their children are also completed. So, every time the function is restarted, a lot of work is saved.

My problem is, I've not been able to create an iterative version of this new function.

This is the iterative version I have right now:

function process_branch(node)
    stack.push(node)
    while not stack.empty
        n=stack.pop()
        if not n.completed
            process_node(n)
            for each c in n.children
                stack.push(c)
            end for
            n.completed=true
        end if
    end while
end function

Notice now, how a given node is marked as completed even before its children branches are actually processed.

With recursion, the n.completed=true statement was delayed until every process_branch(c) for its children (and children of children, etc.) was finished.

<TL;DR> section ends here

So, if the function is stopped in the middle of this, there will be some whole branch marked as completed (which is false), and it will be ignored in the next function restart.

How could I implement the delayed n.completed=true in the iterative version?

I can't see it. I hope it's just a mind block.

Thanks for your help!

EDIT:

Thanks to the suggestions about playing with the stack, I found a simple solution, which basically requires two things:

  1. Pushing the parent node (again) before its children, so after each child is processed, the main loop can return to it, and set it as completed.
  2. Introducing a new status/property for each node ("processed"), for telling apart processed nodes and skip already completed work. Notice that this is required because the processing of each node happens as soon as it's reached. Or in other words, a node is processed long before its branch is completed.
function process_branch(node)
    stack.push(node)
    while not stack.empty
        n=stack.pop()
        if not n.completed
            if not n.processed
                process_node(n)
            end if
            t=0
            for each c in n.children
                if not c.completed
                    t=t+1
                end if
            end for
            if t=0
                n.completed=true
            else
                stack.push(n)
            end if
            for each c in n.children
                if not c.completed
                    stack.push(c)
                end if
            end for
        end if
    end while
end function

Hope this helps somebody in the future.

Upvotes: 0

Views: 81

Answers (1)

DarkSquirrel42
DarkSquirrel42

Reputation: 10267

you are using a stack to store "what to do next"

instead of just storing the nodes, maybe store what to do with them too...

function process_branch(node)
    stack.push(new Job(node,true)) // true for processing a node
    while not stack.empty
        j=stack.pop()
        if not j.n.completed
            if (j.process) // test if processing or just marking complete
                process_node(j.n)
            else
                j.n.completed=true
            end if
            for each c in n.children
                stack.push(new Job(c,true)) // true for processing a node
            end for
            stack.push(new Job(n,false)) // false for marking complete
        end if
    end while
end function

you will want to rearange it to enable restarting the processing after you aborted... i'm kind of in a hurry right now, so it's just solving the problem of when a node becomes complete for now, maybe that's enough already...

Upvotes: 1

Related Questions