Reputation: 105
I've seen contradicting claims about the worst case time complexity for a solution to the GeeksforGeeks challenge Check if Tree is Isomorphic:
Given two Binary Trees. Check whether they are Isomorphic or not.
Note:
Two trees are called isomorphic if one can be obtained from another by a series of flips, i.e. by swapping left and right children of several nodes. Any number of nodes at any level can have their children swapped. Two empty trees are isomorphic.
For example, the following two trees are isomorphic with the following sub-trees flipped: 2 and 3, NULL and 6, 7 and 8.
My question concerns the following algorithm that solves this:
isomorphic(Node root1, Node root2) {
if (root1 == null && root2 == null) {
return true;
}
if (root1 == null || root2 == null) {
return false;
}
if (root1.data != root2.data) {
return false;
}
return (isIsomorphic(root1.left, root2.left) &&
isIsomorphic(root1.right, root2.right)) ||
(isIsomorphic(root1.left, root2.right) &&
isIsomorphic(root1.right, root2.left));
}
The editorial that comes with this code challenge presents this algorithm and states:
Time Complexity: O(min(N1, N2)), where N1 and N2 are the number of nodes in the two trees. The algorithm traverses each tree's nodes only up to the point where the smaller tree ends. This is because if the smaller tree ends, the extra nodes in the larger tree would not have a corresponding node in the smaller tree, making the trees non-isomorphic. Therefore, the algorithm's time complexity depends on the number of nodes in the smaller of the two trees, which is represented by min(N1, N2).
Also a top-ranked and pinned article in their comment section, by Gevendra Verma, states about the same algorithm:
The time complexity of this algorithm is O(n), where n is the number of nodes in the tree. This is because we visit each node exactly once.
However, some other people claim in that comment section that it is not linear, using the following logic:
Suppose we have L levels marked as l to 0 where 0 is the bottom level. Now, T(l) = 4T(l-1)+c (Considering all 4 recursive checks to be of same time for simplicity) T(l) = 4^L + c(4+4^2+4^3+....+4^(l-1)) T(l) = 4^L+c*4^L
T(L) = c*4^L
Now, L is logN, right? So, T(L) = c*4^LogN
O(N) = 2^2*LogN = N^2
But then many other sources claim linear time complexity. So who is right here? The counter argument was this:
Either you call left and right of node and if you get true than you not call again for node's left and right . If you get false in first case no further call will happen from the the chosen nodes and you will call for for second right and left
I thought that perhaps since we make 2 comparisons at each level, we make 2h comparisons in total. Thus 2log(n) = n.
Which is the true time complexity of this particular algorithm? Who is right here?
Upvotes: 8
Views: 325
Reputation: 351029
It isn't linear. I have no absolute proof (yet), but good enough evidence in my opinion that in the worst case it is O(𝑛log𝑛), so I decided to post what I have:
The quote you made that defends a O(𝑛) complexity says:
Either you call left and right of node and if you get true than you not call again for node's left and right . If you get false in first case no further call will happen from the the chosen nodes and you will call for for second right and left
But this is not a correct reasoning. For a given execution of the function, that is not a base case, the first recursive call is made: isIsomorphic(root1.left, root2.left)
. Let's focus on root1.left
now: there might be a significant traversal into that subtree before this first call returns. If this returns false, isIsomorphic(root1.left, root2.right)
is called. Note that we potentially traverse a big part of root1.left
again! Now if anywhere in that traversal we get a repetition of this scenario, there is an exponential effect: 2 traversals of the same nodes becomes 4, ...etc.
Secondly, if we simply put a counter property in each node of the tree, and increment it for the node that is the first argument of the function, we can rapidly see that these counts can go (much) higher than just 1 or 2.
But those who counter it and say it is O(𝑛²) (as you have quoted) are wrong too: the argument that at every evaluation of the boolean expression (with the recursive calls), we could have 4 recursive calls doesn't work: if you have 4 recursive calls, then logically the outcome will be false (just think about it), but if all 4 "children" also maximize the count of their recursive calls, those would all return false too, which leads to a contradiction, as you need two of your recursive calls to return true and the other two to return false. So in general it is not possible to always reach that maximum of 4 calls.
In order to analyse the complexity we need to get a grip on what constitutes a worst in terms of 𝑛:
With some hours of testing with random samples I found that the following tree configurations produce the greatest number of calls with respect to 𝑛.
All values in the tree should be the same (I chose 0) except 5 values (I chose 1 for those). If we divide the bottom level of each of the two trees into 4 equal groups (left to right), then these 1-values should be located here:
Left tree: 00...01 00...00 00...01 00...01
Right tree: 00...00 10...00 00...00 10...00
For instance, if the chosen height is 3 (the number of edges on longest root-leaf path), then 𝑛 = 15, and we get this input:
____0____ ____0____
/ \ / \
_0_ _0_ _0_ _0_
/ \ / \ / \ / \
0 0 0 0 0 0 0 0
/ \ / \ / \ / \ / \ / \ / \ / \
0 1 0 0 0 1 0 1 0 0 1 0 0 0 1 0
I decided to count those calls where neither of the arguments is null
, with this extension of your function (in JavaScript):
function countIsomorphic(root1, root2) {
let count = 0; // To be returned
// Your function:
function isomorphic(root1, root2) {
if (root1 == null && root2 == null) {
return true;
}
if (root1 == null || root2 == null) {
return false;
}
count++; // Added
if (root1.data != root2.data) {
return false;
}
return (isomorphic(root1.left, root2.left) &&
isomorphic(root1.right, root2.right)) ||
(isomorphic(root1.left, root2.right) &&
isomorphic(root1.right, root2.left));
}
// Call the function
isomorphic(root1, root2);
return count; // Ignore the boolean result, but return the count instead
}
After again hours of analysing the numbers, I finally arrived at this formula for what countIsomorphic
will return, based on the height ℎ of (both) input trees:
(2ℎ−3)2ℎ+1 + 9
NB: I could only find this empirically, so this is where I must say it is not a proof.
As 𝑛 = 2ℎ+1 − 1, then ℎ = log2(𝑛 + 1) − 1, and this means we have a number of calls that is
(2log2(𝑛 + 1) − 5)(𝑛 + 1) + 9
...which means the worst case time complexity is O(𝑛log𝑛).
Here is a runnable snippet (in JavaScript) that counts the number of calls for heights between 1 and 14 and compares this count with what the above formula gives. They are equal and so I have quite some confidence this is also true for higher heights of trees (with the shape/pattern that I described earlier):
class Node {
constructor(data, left, right) {
this.data = data;
this.left = left;
this.right = right;
}
}
function countIsomorphic(root1, root2) {
let count = 0; // To be returned
// Your function:
function isomorphic(root1, root2) {
if (root1 == null && root2 == null) {
return true;
}
if (root1 == null || root2 == null) {
return false;
}
count++; // Added
if (root1.data != root2.data) {
return false;
}
return (isomorphic(root1.left, root2.left) &&
isomorphic(root1.right, root2.right)) ||
(isomorphic(root1.left, root2.right) &&
isomorphic(root1.right, root2.left));
}
// Call the function
isomorphic(root1, root2);
return count; // Ignore the boolean result, but return the count instead
}
function makePerfectTree(binary) {
let level = Array.from(binary, digit => new Node(digit));
while (level.length > 1) {
level.push(new Node(0, ...level.splice(0, 2)));
}
return level.pop();
}
function getCount(h) {
const length = 1 << h;
const binary1 = Array(length).fill(0);
const binary2 = Array(length).fill(0);
// Create (apparent) worst case pattern:
const quarter = 1 << (h-2);
const mid = 1 << (h-1);
binary1[quarter - 1] = 1;
binary2[quarter] = 1;
binary1[mid + quarter - 1] = 1;
binary2[mid + quarter] = 1;
binary1[length-1] = 1;
// Create two perfect trees of given height, and populate the
// leaf-level with the given values
const root1 = makePerfectTree(binary1);
const root2 = makePerfectTree(binary2);
// Run the algorithm to get back a call-count
return countIsomorphic(root1, root2);
}
for (let h = 1; h <= 14; h++) {
const actual = getCount(h);
const calculated = (2*h-3) * 2**(h+1) + 9;
console.log(`for height ${h}, the call count is ${actual}`);
if (actual !== calculated) throw `Got ${actual}, but calculated ${calculated}!`
}
Upvotes: 4