Reputation: 2512
Let's say I have binary trees A and B and I want to know if A is a "part" of B. I am not only talking about subtrees. What I want to know is if B has all the nodes and edges that A does.
My thoughts were that since tree is essentially a graph, and I could view this question as a subgraph isomorphism problem (i.e. checking to see if A is a subgraph of B). But according to wikipedia this is an NP-complete problem.
http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem
I know that you can check if A is a subtree of B or not with O(n) algorithms (e.g. using preorder and inorder traversals to flatten the trees to strings and checking for substrings). I was trying to modify this a little to see if I can also test for just "parts" as well, but to no avail. This is where I'm stuck.
Are there any other ways to view this problem other than using subgraph isomorphism? I'm thinking there must be faster methods since binary trees are much more restricted and simpler versions of graphs.
Thanks in advance!
EDIT: I realized that the worst case for even a brute force method for my question would only take O(m * n), which is polynomial. So I guess this isn't a NP-complete problem after all. Then my next question is, is there an algorithm that is faster than O(m*n)?
Upvotes: 1
Views: 113
Reputation: 17238
you could compare the trees in bottom-up as follows:
you succeed, if you haven't failed ;-).
the main part of the algorithm runs in O(e_B)
- in the worst case, all edges in B are visited a constant number of times. the leaf node matching will run in O(n_A * log n_B)
if there the B vertices are sorted, O(n_A * log n_A + n_B * log n_B + n) = O(n_B * log n_B)
(sort each node set, lienarly scan the results thereafter) otherwise.
EDIT: re-reading your question, abovementioned step 2 is even easier, as for matching nodes in A, B, their parents must match too (otheriwse there would be a mismatch between the edge sets). no effect on worst-case run time, of course.
Upvotes: 0
Reputation: 13177
I would approach this problem in two steps:
A
in B
(either BFS of DFS)A
is contained in B
(giving that starting node), using a recursive algorithm, as below (I concocted same crazy pseudo-language, because you didn't specify the language. I think this should be understandable, no matter your background). Note that a
is a node from A
(initially the root) and b
is a node from B
(initially the node found in step 1)function checkTrees(node a, node b) returns boolean
if a does not exist or b does not exist then
// base of the recursion
return false
else if a is different from b then
// compare the current nodes
return false
else
// check the children of a
boolean leftFound = true
boolean rightFound = true
if a.left exists then
// try to match the left child of a with
// every possible neighbor of b
leftFound = checkTrees(a.left, b.left)
or checkTrees(a.left, b.right)
or checkTrees(a.left, b.parent)
if a.right exists then
// try to match the right child of a with
// every possible neighbor of b
leftFound = checkTrees(a.right, b.left)
or checkTrees(a.right, b.right)
or checkTrees(a.right, b.parent)
return leftFound and rightFound
About the running time: let m
be the number of nodes in A
and n
be the number of nodes in B
. The search in the first step takes O(n)
time. The running time of the second step depends on one crucial assumption I made, but that might be wrong: I assumed that every node of A
is equal to at most one node of B
. If that is the case, the running time of the second step is O(m)
(because you can never search too far in the wrong direction). So the total running time would be O(m + n)
.
While writing down my assumption, I start to wonder whether that's not oversimplifying your case...
Upvotes: 2