TakeASadSong
TakeASadSong

Reputation: 17

Inserting elements from a sorted array into a BST in an efficient way

I am struggling with this exercise:

Given a BST T, whose nodes contain only a key field, a left and right field and a sorted array A which contains m keys. Write an efficient algorithm which inserts into T any A's keys which are not already present in T. You must not apply the InsertBST(T,key) algorithm on single A's keys.

For example if the BST contains 1,3,4,5 and A contains 1,2,5,6, I have to insert 2,6 without using InsertBST(T,2) and InsertBST(T,6).

This is what I tried:

Algo(T,A,index)
     if T != null then
          Algo(T->sx,A,index)
          p = NewNode()
          p->key = A[index]
          if T->key > A[index] then
               T->sx = p
               index = index + 1
          else if T->key < A[index] then
               T->dx = p
               index = index + 1
          Algo(T->dx,A,index)

But it inserts nodes at the wrong places. Can you help me?

Upvotes: 0

Views: 200

Answers (1)

trincot
trincot

Reputation: 350137

I see the following issues with your attempt:

  1. In each recursive call you are going to insert a node. So for instance, also when you reach the first, left-most leaf. But this cannot be right. The first node insertion may have to happen completely elsewhere in the tree without affecting this leaf at all

  2. There is no check that index runs out of bounds. The code assumes that there are just as many values to insert as there are nodes in the tree.

  3. There is no provision for when the value is equal to the value of an existing node in the tree. In that case the value should not be added at all, but ignored.

  4. In most languages index will be a local variable to the Algo function, and so updating it (with + 1) will have no effect to the index variable that the caller has. You would need a single index variable that is shared across all executions of Algo. In some languages you may be able to pass index by reference, while in other languages you may be able to access a more global variable, which is not passed as argument.

Algorithm

The algorithm to use is as follows:

Perform the inorder traversal (as you already did). But keep track of the previously visited node. Only when you detect that the next value to be inserted is between the values of those two nodes, you need to act: in that case create a new node and insert it. Here is how to detect where to insert it. Either:

  1. the current node has no left child, which also means the previous node (in the inorder sequence) is not existing or is higher up the tree. In this case the new node should become the left child of the current node

  2. the current node has a left child, which also means the previous node is in the left subtree and has no right child of its own. In that case, add the new node as right child of the previous node.

The Sentinel idea

There is a possibility that after visiting the last node in inorder sequence, there are still values to insert, because they are greater than the greatest value in the tree. Either you must treat that case separately after the recursive traversal has finished, or you can begin your algorithm by adding a new, temporary root to your tree, which has a dummy, but large value -- greater than the greatest value to insert. Its left child is then the real root. With that trick you are sure that the recursive logic described above will insert nodes for all values, because they will all be less than that temporary node's value.

Implementation in Python

Here is the algorithm in Python syntax, using that "sentinel" node idea:

def insertsorted(root, values):
    if not values:
        return root  # Nothing to do

    # These variables are common to all recursive executions:
    i = 0  # index in values
    prev = None  # the previously visited node in inorder sequence

    def recur(node):
        nonlocal i, prev  # We allow those shared variables to be modified here
        if i < len(values) and node.left:
            recur(node.left)
        while i < len(values) and values[i] <= node.value:
            if values[i] < node.value:
                newnode = Node(values[i])
                if node.left is None:
                    node.left = newnode
                else:
                    prev.right = newnode
                prev = newnode
            i += 1
        prev = node
        if i < len(values) and node.right:
            recur(node.right)
    
    # Create a temporary node that will become the parent of the root
    # Give it a value greater than the last one in the array
    sentinel = Node(values[-1] + 1)  
    sentinel.left = root
    recur(sentinel)
    # Return the real root to the caller
    return sentinel.left

Upvotes: 1

Related Questions