Nobilis
Nobilis

Reputation: 7448

Implementing a deletion function for a binary search tree in C

I've been trying to implement a function in C that deletes a node in a binary tree that should (theoretically) take care of three all cases, i.e.:

Is there a way to handle the whole deletion function without checking separately each case? As a commenter below noted I do check for a lot of cases and perhaps the whole problem can be addressed recursively by checking for one fundamental case.

I'm particularly interested in the case where I delete a node within the tree that has a parent and itself is a parent of two children nodes.

Both answers below have been useful but I don't think they address the problem in its entirety.

Here's what I have:

typedef struct Node
{
    int key;
    int data;
    struct Node *left;
    struct Node *right;
    struct Node *parent;
} Node;

/* functions that take care of inserting and finding a node and also traversing and freeing the tree */
...

void delete(Node *root, int key)
{
    Node *target = find(root, key); // find will return the node to be deleted

    Node *parent = target->parent; // parent of node to be deleted

    // no children
    if (target->left == NULL && target->right == NULL)
    {
        // is it a right child
        if (target->key > parent->key)
            parent->right = NULL;
        // must be a left child
        else
            parent->left = NULL;

        free(target);
    }

    // one child
    else if ((target->left == NULL && target->right != NULL) || (target->left != NULL && target->right == NULL))
    {
        // here we swap the target and the child of that target, then delete the target
        Node *child = (target->left == NULL) ? target->right : target->left;
        child->parent = parent;
        if (parent->left == target) parent->left = child;
        else if (parent->right == target) parent->right = child;
        free(target);
    }
    // two children
      else
    {

        // find the largest node in the left subtree, this will be the node
        // that will take the place of the node to be deleted
        Node *toBeRepl = max(target->left);

        // assign the data of the second largest node
        target->key   = toBeRepl->key;
        target->data  = toBeRepl->data;

        // if new node immediately to the left of target 
        if (toBeRepl == target->left)
        {   
            target->left  = toBeRepl->left;
            Node *newLeft = target->left;
            if (newLeft != NULL) newLeft->parent = target;
        }
        else
        {
            delete(target->left, toBeRepl->key);
            // Node *replParent = toBeRepl->parent;
            // replParent->right = NULL;
        }
}

I would greatly appreciate your feedback.

edit: Just to clarify, I'm trying to delete a particular node without touching its subtrees (if there are any). They should remain intact (which I've handled by swapping the values of the node to be deleted and (depending on the case) one of the nodes of its substrees).

edit: I've used as a reference the following wikipedia article: http://en.wikipedia.org/wiki/Binary_search_tree#Deletion
Which is where I got the idea for swapping the nodes values in case of two children, particularly the quote:

Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Replace the value of N with the value of R, then delete R.

There is some interesting code in C++ there for the above case, however I'm not sure how exactly the swap happens:

else    //2 children
{
        temp = ptr->RightChild;
        Node<T> *parent = nullptr;

        while(temp->LeftChild!=nullptr)
        {
                parent = temp;
                temp = temp->LeftChild;
        }
        ptr->data = temp->data;
        if (parent!=nullptr)
                Delete(temp,temp->data);
        else
                Delete(ptr->rightChild,ptr->RightChild->data);
}

Could somebody please explain what's going on in that section? I'm assuming that the recursion is of a similar approach as to the users comments' here.

Upvotes: 0

Views: 4225

Answers (3)

Nobilis
Nobilis

Reputation: 7448

I finished this a long time ago and I thought it would be good to add a sample answer for people coming here with the same problem (considering the 400+ views this question has accumulated):

/* two children */
else
{
    /* find the largest node in the left subtree (the source), this will be the node
     * that will take the place of the node to be deleted */
    Node* source = max(target->left);

    /* assign the data of that node to the one we originally intended to delete */
    target->key   = source->key;
    target->data  = source->data;

    /* delete the source */
    delete(target->left, source->key);
}

Wikipedia has an excellent article that inspired this code.

Upvotes: 0

Afonso Tsukamoto
Afonso Tsukamoto

Reputation: 1214

I would do it using recursion, assuming that you have null at the end of your tree, finding null would be the 'go back' or return condition.

One possible algorithm would be:

Node* delete(Node *aNode){
  if(aNode->right != NULL)
     delete(aNode->right);
  if(aNode->left != NULL)
     delete(aNode->left);
  //Here you're sure that the actual node is the last one
  //So free it! 
  free(aNode);
  //and, for the father to know that you're now empty, must return null
  return NULL;

}

It has some bugs, for sure, but is the main idea. This implementation is dfs like. Hope this helps.

[EDIT] Node *aNode fixed. Forgot the star, my bad.

Upvotes: 1

varevarao
varevarao

Reputation: 2186

I don't see any "inelegance" in the code, such formatting and commented code is hard to come by. But yes, you could reduce the if-else constructs in your delete function to just one case. If you look at the most abstract idea of what deletion is doing you'll notice all the cases basically boil down to just the last case (of deleting a node with two children).

You'll just have to add a few lines in it. Like after toBeRepl = max(left-sub-tree), check if it's NULL and if it is then toBeRepl = min(right-sub-tree).

So, Case 1 (No children): Assuming your max() method is correctly implemented, it'll return NULL as the rightmost element on the left sub-tree, so will min() on the right sub-tree. Replace your target with the toBeRepl, and you'll have deleted your node.

Case 2 (One child): If max() does return NULL, min() won't, or vice-versa. So you'll have a non-NULL toBeRepl. Again replace your target with this new toBeRepl, and you're done.

Case 3 (Two children): Same as Case 2, only you can be sure max() won't return NULL.

Therefore your entire delete() function would boil down to just the last else statement (with a few changes). Something on the lines of:

Node *toBeRepl = max(target->left);
if toBeRepl is NULL
{
  toBeRepl = min(target->right);
}

if toBeRepl is not NULL
{
  target->key = tobeRepl->key;
  target->data = toBeRepl->data;

  deallocate(toBeRepl); // deallocate would be a free(ptr) followed by setting ptr to NULL
}
else
{
  deallocate(target);
}

Upvotes: 1

Related Questions