rgamber
rgamber

Reputation: 5849

Binary Tree Path Sum logic

I was trying to understand the logic in the code to solve the Path Sum. This is a solved problem in Gayle Laakmann McDowell's book, though my code looks a little different.

Problem: Given a Binary Tree with each node containing an integer value, count all (downward) paths that sum to a target value.

Code:

private Map<Integer, Integer> map = new HashMap<>();
private int count = 0;

private int pathSum(BinaryTreeNode root, int tgtSum)
{
    map.put(0, 1);
    pathSum(root, 0, tgtSum);
    return count;
}

private void pathSum(BinaryTreeNode root, int runningSum, int tgtSum)
{
    if (root == null) return;

    runningSum += root.data;

    if (map.containsKey(runningSum - tgtSum)) 
    {
        count += map.get(runningSum - tgtSum);
    }

    if (map.containsKey(runningSum))
    {
        map.put(runningSum, map.get(runningSum) + 1);
    }
    else
    {
        map.put(runningSum, 1);
    }

    pathSum(root.left, runningSum, tgtSum);
    pathSum(root.right, runningSum, tgtSum);

    map.put(runningSum, map.get(runningSum) - 1);
}

My issue:

I (kinda) understand the logic where the the runningSum is stored in the map as the key, and the value of (runningSum - tgtSum) must have to exist as one of the keys in the map if some consecutive nodes add up to the tgtSum since that difference will be the runningSum of some other node.

What I am unable to understand is that why can't we replace count += map.get(runningSum - tgtSum) with count++?

I tried debugging it, and I can see that in some cases if the same branch has multiple consecutive nodes adding up to the tgtSum, using count++ counts only once, whereas using the frequency stored in the map the above way gives the correct output.

Though I was able to get it right after spending time debugging, I am not able to connect this part of logic of updating the variable count with the base logic of the problem by using a map with runningSum as the key and the frequency of that running sum as the value.

Can anyone simplify and explain?

Upvotes: 3

Views: 395

Answers (2)

גלעד ברקן
גלעד ברקן

Reputation: 23955

Let's pick the function up somewhere in the middle of execution. Skip the first part where count in incremented. The next two instructions have to do with recording how many of the same runningSums have been recorded: if we've seen it, increment; otherwise, assign one count to the runningSum key.

Now let's go back to the instruction incrementing count. Imagine we've reached runningSum 20 while tgtSum is 12 and there's an 20 - 12 = 8 key already recorded in the map. This means runningSum was 8 at some earlier node on our way to the current location.

                           ...
                           ...
         A  (runningSum 8) •
                          /
          (runningSum 7) • (-1)
                        /
     B  (runningSum 8) • 1
                      / \
     (runningSum 11) • 3 ...
                    /    ...
C  (runningSum 20) • 9
                   ...
                   ...

We need to count both the path from A to C and the path from B to C, so we add the total count so far of runningSum 8 (stored as the value for key 8 in the map), which represents all segments that start at a node where runningSum was 8 and end where we are currently (2 segments in this case).

Now let's say we reach runningSum 20 again somewhere down the right child of B. Again we'll increment count by 2 for the two new segments summing to 12 that we found. After completing both subtrees of B, we decrement the value stored for runningSum 8 since we've counted all valid segments that could start there.

The last instruction, "map.remove(runningSum);" seems wrong to me. It should be "decrement," since otherwise we'd be unable to correctly count segments starting at A that go to a right child before node B. You can find confirmation for this observation here: https://github.com/katlew/Cracking-Coding-Interview/blob/master/chapter_4/pathsWithSum.java

Upvotes: 1

uSeemSurprised
uSeemSurprised

Reputation: 1834

You have to find all possible paths, hence the frequency of the required sum is necessary. If you do count++ the answer only calculates it for one path instead of all the possible paths there might be.

Consider the following example :

                    2
             3             4
         1     4        3     1

Now consider the paths 2->3->4 and 2->4->3 they both sum to the same values but they are 2 different paths. If you only increase the count by 1, you would miss the other path therefore frequency is required.

Upvotes: 0

Related Questions