philipp
philipp

Reputation: 16495

Why storing data only in the leaf nodes of a balanced binary-search tree?

I have bought a nice little book about computational geometry. While reading it here and there, I often stumbled over the use of this special kind of binary search tree. These trees are balanced and should store the data only in the leaf nodes, whereas inner nodes should only store values to guide the search down to the leaves.

The following image shows an example of this trees (where the leaves are rectangles and the inner nodes are circles).

illustration of a binary search tree

I have two questions:

  1. What is the advantage of not storing data in the inner nodes?

  2. For the purpose of learning, I would like to implement such a tree. Therefore, I thought it might be a good idea to use an AVL tree as the basis, but is it a good idea?

Any kind of helpful resource is very welcome.

Upvotes: 10

Views: 3904

Answers (5)

vlad
vlad

Reputation: 4778

What is the advantage of not storing data in the inner nodes?

There are some tree data structures that, by design, require that no data is stored in the inner nodes, such as Huffman code trees and B+ trees. In the case of Huffman trees, the requirement is that no two leaves have the same prefix (i.e. the path to node 'A' is 101 whereas the path to node 'B' is 10). In the case of B+ trees, it comes from the fact that it is optimized for block-search (this also means that every internal node has a lot of children, and that the tree is usually only a few levels deep).

For the purpose of learning, I would like to implement such a tree. Therefore, I thought it might be a good idea to use an AVL tree as the basis, but is it a good idea?

Sure! An AVL tree is not extremely complicated, so it's a good candidate for learning.

Upvotes: 6

mikyra
mikyra

Reputation: 10357

What is the advantage of not storing data in the inner nodes?

In general, there is no advantage in not storing data in the inner nodes. For example, a red-black tree is a balanced tree and it stores its data into the inner and leaf nodes.

For the purpose of learning, I would like to implement such a tree. Therefore, I thought it might be a good idea to use an AVL tree as the basis, but is it a good idea?

In my opinion, it is.

Upvotes: 1

mszlazak
mszlazak

Reputation: 59

I am reading the same book and they say it could be done either way, data storage at external or at internal nodes.

The trees they use are Red-Black.

In any case, here is an article that stores data at internal nodes of a Red Black Tree and then links these data nodes together as a list.

Balanced binary search tree with a doubly linked list in C++ by Arjan van den Boogaard

http://archive.gamedev.net/archive/reference/programming/features/TStorage/default.html

Upvotes: -1

Chris Okasaki
Chris Okasaki

Reputation: 4955

It is common to have other kinds of binary trees with data at the leaves instead of the interior nodes, but fairly uncommon for binary SEARCH trees.

One reason you might WANT to do this is educational -- it's often EASIER to implement a binary search tree this way then the traditional way. Why? Almost entirely because of deletions. Deleting a leaf is usually very easy, whereas deleting an interior node is harder/messier. If your data is only at the leaves, then you are always in the easy case!

It's worth thinking about where the keys on interior nodes come from. Often they are duplicates of keys that are also at the leaves (with data). Later, if the key at the leaf is deleted, the key at the interior nodes might still hang around.

Upvotes: 2

Mark Wilkins
Mark Wilkins

Reputation: 41222

One benefit to only keeping the data in leaf nodes (e.g., B+ tree) is that scanning/reading the data is exceedingly simple. The leaf nodes are linked together. So to read the next item when you are at the "end" (right or left) of the data within a given leaf node, you just read the link/pointer to the next (or previous) node and jump to the next leaf page.

With a B tree where data is in every node, you have to traverse the tree to read the data in order. That is certainly a well-defined process but is arguably more complex and typically requires more state information.

Upvotes: -1

Related Questions