Daniel
Daniel

Reputation: 449

Understand JavaScript Recursion and Call Stack in Depth First Traversal

I have a perfect balanced binary tree:

          1
        /   \
       2     5
      / \   / \
     3   4 6   7

A simple node structure with DF traversal:

function TreeNode(val){
  this.val = val
  this.left = this.right = null
  this.addLeft = function(node){
    this.left = node
  }
  this.addRight = function(node){
    this.right = node
  }
}

function visit(node){
  console.log(node.val)
}

function traverse(node){
  if (node == null) return

  visit(node)

  if(node.left){
    traverse(node.left)
  }
  if(node.right){
    traverse(node.right)
  }
}

I create nodes to represent a tree structure:

let rootNode = new TreeNode(1)
let node_A = new TreeNode(2)
let node_B = new TreeNode(5)
let node_C = new TreeNode(3)
let node_D = new TreeNode(4)
let node_E = new TreeNode(6)
let node_F = new TreeNode(7)
rootNode.addLeft(node_A)
rootNode.addRight(node_B)
node_A.addLeft(node_C)
node_A.addRight(node_D)
node_B.addLeft(node_E)
node_B.addRight(node_F)

Calling traverse(rootNode) prints out correctly:

1
2
3
4
5
6
7

I understand how recursion works, but am still a little bit confused how JavaScript handles it in the call stack. traverse(rootNode) is put in the call stack first, then it reaches the if condition, inside here it checks that rootNode has a node left, so it continues down the path until it reaches the final node, which is TreeNode(3). The call stack is like this:

|                             |
|                             |
| traverse(TreeNode(3))       |
| traverse(TreeNode(2))       |
| traverse(rootNode)          | 
|_____________________________|        |

TreeNode(3) does not have any node.left or node.right so it returns to the if condition, and goes down to check node.right, the second condition. Then it sees indeed TreeNode(2) has a node.right and traverse to TreeNode(4) and push it to the stack.

Now this part is confusing me. How does JavaScript keep track of calling rootNode.right when traverse(TreeNode(4) is completed? In other words, how does it know to switch to the right branch at the end of left branch? Because the output prints 1 to 7, so the call stack must be:

|                             |
| traverse(TreeNode(7))       |
| traverse(TreeNode(6))       |
| traverse(TreeNode(5))       |
| traverse(TreeNode(4))       |
| traverse(TreeNode(3))       |
| traverse(TreeNode(2))       |
| traverse(rootNode)          | 
|_____________________________|

But I believe the top of the stack will be the first to pop out and return, so the output should be starting from 7, not 1. So my second question is why the console log correctly prints out the result from 1 to 7.

Upvotes: 4

Views: 1541

Answers (1)

Mark Adelsberger
Mark Adelsberger

Reputation: 45819

Well, your explanation leaves some important steps out, and I think that's allowing you to reach some invalid conclusions. In particular, the stack never looks like

traverse(TreeNode(7))       
traverse(TreeNode(6))       
traverse(TreeNode(5))       
traverse(TreeNode(4))       
traverse(TreeNode(3))       
traverse(TreeNode(2))       
traverse(rootNode)          

For the record, none of this is javascript-specific; so really I think this is about improving how well you understand recursion.

Let's step through your traverse(rootNode) call in more detail. So of course the call stack starts out like

0: traverse ( node = rootNode ) {
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

Here I'm using notation that makes a few things more explicit:

  • I put a number before the call; this way, for example, I could refer to the above call as "call 0 on the stack"
  • I show the parameter assignments of each call
  • I show the code of the function behind each call, with an arrow => indicating current progress through the code for this particular call. Here we're about to execute the first instruction.

Following the convention you appear to be using, the most recently-pushed (next-to-pop) item will be shown at the top.

So now since node is rootNode and that's not null, it doesn't just immediately return. Before any recursion starts, it calls visit() on rootNode, which goes ahead and prints 1. This is one of the things that seems missing from your explanation - nodes are visited (and their values printed) before any recursion, so they're being printed as you traverse, as you first reach each node.

Then it checks the left and finds a truthy value (in this case, another node object), so it calls traverse(node.left) and we end up with a stack like

1: traverse ( node = node_A )
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

So again we start executing traverse() from the beginning with our new node value, which is not null so we go ahead and visit(node) - which, since node is node_A, prints 2. Then it checks left, which is truthy (node_C) and we get

2: traverse ( node = node_C )
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

node_C isn't null, so it gets visit()ed and this prints 3. Now we check left, but it's falsey (undefined). So we check right and it, too, is falsey (undefined). So we return and now the stack says

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

Now we're already midway through the traverse() code for call 1 on the stack, so we pick up where we left off (at the =>). So now we check right, which is truthy and we get

2: traverse ( node = node_D )
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_ =>  }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

Now visit will print 4 (because node is node_D, and both left and right are falsey, so we return to

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_ =>  }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

But call 1 on the stack has reached the end of it's code, so it returns and we go to

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

Call 0 picks up where it left off, which means it's about to check right (which, I think, answers your original question). Execution proceeds down the right branch in exactly the same way it proceeded down the left, with the stack at various times containing calls for rootNode, node_B, then rootNode, node_B, node_E, then rootNode, node_B again (but with the code partially executed), then rootNode, node_B, node_F, then one last time rootNode, node_B (with the code about to finish), then back to rootNode, and then the initial call to traverse finally returns.

Upvotes: 3

Related Questions