Why are inorder and preorder traversals useful for creating an algorithm to decide if T2 is a subtree of T1?

I'm looking at an interview book and the question is:

You have two very large binary trees: T1, with millions of nodes, and T2, with hundreds of nodes. Create an algorithm to decide if T2 is a subtree of T1.

The author mentions this as a possible solution:

Note that the problem here specifies that T1 has millions of nodes—this means that we should be careful of how much space we use. Let’s say, for example, T1 has 10 million nodes—this means that the data alone is about 40 mb. We could create a string representing the inorder and preorder traversals. If T2’s preorder traversal is a substring of T1’s preorder traversal, and T2’s inorder traversal is a substring of T1’s inorder traversal, then T2 is a substring of T1.

I'm not quite sure of the logic behind why, if both of these conditions are true, T2 must be a substring (although I assume the author means subtree) of T1. Can I get an explanation of this logic?

EDIT: User BartoszMarcinkowski brings up a good point. Assume both trees have no duplicate nodes.

Upvotes: 9

Views: 897

Answers (3)

Daniel Imms

Reputation: 50229

Here is a counter-example to the method.

Consider the tree T1:

  B
 / \
A   D
   / \
  C   E
       \
        F

And the sub-tree T2:

  D
 / \
C   E

The relevant traversals are:

  • T1 pre-order: BADCEF
  • T2 pre-order: DCE
  • T1 in-order: ABCDEF
  • T2 in-order: CDE

While DCE is a substring of BADCEF and CDE is a substring of ABCDEF, T2 is not actually a sub-tree of T1: the sub-tree of T1 rooted at D also contains F. The author's definition of sub-tree must have been different, or it was just a mistake.
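The counter-example can be checked mechanically. The following is a minimal sketch, not code from the book: the `Node` class and the helpers `preorder`, `inorder`, `same_tree` and `is_subtree` are names made up here for illustration, and "sub-tree" is assumed to mean a node of T1 together with all of its descendants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type with single-character keys.
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(n):
    return "" if n is None else n.key + preorder(n.left) + preorder(n.right)


def inorder(n):
    return "" if n is None else inorder(n.left) + n.key + inorder(n.right)


def same_tree(a, b):
    # Two trees are identical if both are empty, or keys and both children match.
    if a is None or b is None:
        return a is b
    return a.key == b.key and same_tree(a.left, b.left) and same_tree(a.right, b.right)


def is_subtree(t1, t2):
    # True if some node of t1, taken with all its descendants, is identical to t2.
    if t1 is None:
        return t2 is None
    return same_tree(t1, t2) or is_subtree(t1.left, t2) or is_subtree(t1.right, t2)


# T1 and T2 exactly as drawn above.
t1 = Node("B", Node("A"), Node("D", Node("C"), Node("E", None, Node("F"))))
t2 = Node("D", Node("C"), Node("E"))

substring_test = preorder(t2) in preorder(t1) and inorder(t2) in inorder(t1)
print(preorder(t1), inorder(t1))            # BADCEF ABCDEF
print(substring_test, is_subtree(t1, t2))   # True False
```

The substring test accepts T2, while the direct structural comparison rejects it because of the extra node F under E.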

Related question: Determine if a binary tree is subtree of another binary tree using pre-order and in-order strings

Upvotes: 1

Bartosz Marcinkowski

Reputation: 6861

I don't think this is true. Consider:

T2:

  2
 / \
1   3

inorder 123 preorder 213

and

T1:

      0
     / \
    3   3
   / \ 
  1   1
 / \ 
0   2


inorder 0123103 preorder 0310213

123 is a substring of 0123103 and 213 is a substring of 0310213, but T2 is not a subtree of T1.
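To double-check the traversals above, here is a small sketch that encodes both trees as nested tuples `(key, left, right)`. This representation and the helper names `leaf` and `subtrees` are assumptions made for this illustration only.

```python
# Trees as nested tuples (key, left, right); None stands for an empty tree.
def preorder(t):
    return "" if t is None else t[0] + preorder(t[1]) + preorder(t[2])


def inorder(t):
    return "" if t is None else inorder(t[1]) + t[0] + inorder(t[2])


def subtrees(t):
    """Yield every node of t together with all of its descendants."""
    if t is not None:
        yield t
        yield from subtrees(t[1])
        yield from subtrees(t[2])


def leaf(k):
    return (k, None, None)


# T2 and T1 exactly as drawn above (T1 contains duplicate keys).
t2 = ("2", leaf("1"), leaf("3"))
t1 = ("0",
      ("3", ("1", leaf("0"), leaf("2")), leaf("1")),
      leaf("3"))

passes_substring_test = preorder(t2) in preorder(t1) and inorder(t2) in inorder(t1)
really_a_subtree = any(s == t2 for s in subtrees(t1))  # tuple == is structural

print(inorder(t1), preorder(t1))                # 0123103 0310213
print(passes_substring_test, really_a_subtree)  # True False
```

With duplicate keys the two matching substrings come from different parts of T1, so the substring test passes even though no node of T1 has the shape of T2.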

Upvotes: 4

Łukasz Kidziński

Reputation: 1623

An important assumption is that the trees have unique keys.

Now, note that preorder-traversal-string and inorder-traversal-string together uniquely identify a binary tree.

Sketch of the proof:

Let T be a tree.

  • First object in preorder-traversal-string(T) is the root.
  • Find it in inorder-traversal-string(T). Everything to the left of that element is your left subtree L; let's call this substring inorder-traversal-string(L). Everything to the right is your right subtree R.

Now, let's focus on the left subtree L.

  • Clearly all subtrees are separated (they don't mix) in both strings. Each is represented as a run of consecutive objects. The only problem is that a priori we don't know where preorder-traversal-string(L) ends in preorder-traversal-string(T).
  • Note that the strings inorder-traversal-string(L) and preorder-traversal-string(L) have the same length. This gives us the place to cut.
  • Now you have a subtree described by the substrings inorder-traversal-string(L) and preorder-traversal-string(L), so you can repeat the procedure till the end.

Following those steps (inefficient but it is just for the proof) for all subtrees you will uniquely build the tree.

Thus, all subtrees of T1 are described uniquely by corresponding inorder-traversal-string and preorder-traversal-string.
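The reconstruction procedure from the proof sketch can be written out as follows. This is only an illustration under the unique-keys assumption, using single-character keys and nested tuples `(key, left, right)`; the function name `build` is made up here.

```python
def build(pre, ino):
    """Rebuild a tree from its preorder and inorder strings (unique keys assumed)."""
    if not pre:
        return None
    root = pre[0]                 # first object in preorder is the root
    k = ino.index(root)           # everything left of it in inorder is L, so |L| = k
    # preorder(L) has the same length as inorder(L), which tells us where to cut.
    left = build(pre[1:1 + k], ino[:k])
    right = build(pre[1 + k:], ino[k + 1:])
    return (root, left, right)


def preorder(t):
    return "" if t is None else t[0] + preorder(t[1]) + preorder(t[2])


def inorder(t):
    return "" if t is None else inorder(t[1]) + t[0] + inorder(t[2])


tree = build("BADCEF", "ABCDEF")
print(tree)
# Round trip: the rebuilt tree produces the same two strings again.
print(preorder(tree) == "BADCEF", inorder(tree) == "ABCDEF")  # True True
```

Like the proof, this favours clarity over efficiency: the slicing and `index` calls make it quadratic in the worst case.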

Upvotes: 1
