Why don't we use 2-3 or 2-3-4-5 trees?

I have a basic understanding of how 2-3-4 trees maintain the height-balance property operation after operation, so that even worst-case operations are O(log n).

But I do not understand it well enough to know why only 2-3-4?

Why not 2-3 or 2-3-4-5 etc?

Upvotes: 9

Answers (3)

Lemmore

Reputation: 11

In my algorithms course we were told that such trees are commonly used for accessing data on a hard disk, where they are known as B-trees or B+ trees. You size each node to fit your available RAM, and by doing so you minimize the number of disk reads: if you build a B-tree whose nodes store, for example, 10^8 elements, you need only about log_{10^8}(n) disk reads to find something on disk, which is next to nothing. So what you call 2-3-4-5-... trees is in fact a widespread solution.
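To put rough numbers on that, here is a minimal sketch that counts how many levels a perfectly full tree needs for a given fan-out. It is a simplified model, assuming one disk read per level; `levels_needed` and the sample fan-outs are illustrative names and values, not taken from any library.

```python
def levels_needed(n_items: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= n_items.

    Integer arithmetic is used on purpose to avoid floating-point
    rounding in math.log for large values.
    """
    if n_items < 1 or fanout < 2:
        raise ValueError("need n_items >= 1 and fanout >= 2")
    levels, capacity = 0, 1
    while capacity < n_items:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    n = 10**9  # assumed number of stored items
    for fanout in (2, 4, 512, 10**8):
        print(f"fanout {fanout}: {levels_needed(n, fanout)} levels")
```

Under these assumptions, a billion items need 30 levels with fan-out 2, 15 with fan-out 4, 4 with fan-out 512, and 2 with fan-out 10^8.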

Upvotes: 1

Jeff Wiegley

Reputation: 181

Implementation of 2-3-4 trees typically requires either multiple classes (2NODE, 3NODE, 4NODE) or you have just NODE that has an array of items. In the case of multiple classes you waste lots of time constructing and destructing node instances and reparenting them is cumbersome. If you use a single class with arrays to hold items and children then you are either resizing arrays constantly which is similarly wasteful or you wind up wasting over half your memory on unused array elements. It's just not very efficient compared to Red-Black trees.
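As a rough illustration of the memory argument, here is a sketch of the single-class variant with fixed-capacity arrays. The class `Node234` and its method are hypothetical, and the slot count assumes an internal node with room for 3 keys and 4 child pointers.

```python
from fractions import Fraction


class Node234:
    """Hypothetical single-class 2-3-4 node with fixed-capacity slots."""

    MAX_KEYS = 3
    MAX_CHILDREN = 4

    def __init__(self, keys):
        if not 1 <= len(keys) <= self.MAX_KEYS:
            raise ValueError("a 2-3-4 node holds 1 to 3 keys")
        self.n_keys = len(keys)
        # Pad to full capacity so every node has the same layout.
        self.keys = list(keys) + [None] * (self.MAX_KEYS - len(keys))
        self.children = [None] * self.MAX_CHILDREN

    def unused_fraction(self) -> Fraction:
        """Fraction of slots left empty, treating this as an internal node."""
        used = self.n_keys + (self.n_keys + 1)  # keys plus child pointers
        total = self.MAX_KEYS + self.MAX_CHILDREN
        return 1 - Fraction(used, total)


if __name__ == "__main__":
    for keys in ([10], [10, 20], [10, 20, 30]):
        print(len(keys) + 1, "-node wastes", Node234(keys).unused_fraction())
```

In this model a 2-node leaves 4 of its 7 slots empty, which is where the "over half" figure can come from when most nodes are 2-nodes.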

Red-Black trees have only one type of node structure. Since Red-Black trees have a duality with 2-3-4 trees, RB trees can use the exact same algorithms as 2-3-4 trees (no need for the stupidly confusing/complex implementations described in Cormen, Leiserson and Rivest that led to AA trees which are not less complex than the 2-3-4 algorithm.)
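The duality can be sketched like this: a black node together with its red children stands for one 2-3-4 node. The `RBNode` class and `as_234_node` function below are illustrative names only, assuming the usual convention that red links bind a child into its parent's 2-3-4 node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RBNode:
    key: int
    red: bool = False
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def as_234_node(black: RBNode) -> List[int]:
    """Keys of the 2-3-4 node represented by a black node and its red children."""
    if black.red:
        raise ValueError("a 2-3-4 node is rooted at a black node")
    keys = []
    if black.left is not None and black.left.red:
        keys.append(black.left.key)   # red left child joins the node
    keys.append(black.key)
    if black.right is not None and black.right.red:
        keys.append(black.right.key)  # red right child joins the node
    return keys


if __name__ == "__main__":
    four_node = RBNode(20, left=RBNode(10, red=True), right=RBNode(30, red=True))
    three_node = RBNode(20, left=RBNode(10, red=True))
    two_node = RBNode(20)
    print(as_234_node(four_node), as_234_node(three_node), as_234_node(two_node))
```

So one node type with a single color bit covers all three of the 2NODE, 3NODE and 4NODE shapes.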

So, Red-Black trees for their ease of implementation plus their memory/CPU efficiency. (AVL trees are nice too. They produce better-balanced trees and are stupidly simple to code, but they tend to be less efficient because they work too often to maintain only a slightly more compact tree.)

Oh, and 2-3-4-5-6... etc. aren't done because nothing is gained. 2-3-4 trees have a net gain over 2-3 trees because they can easily be implemented without recursion (recursion tends to be less efficient, especially when it cannot be coded tail-recursively). However, B-Trees and Bplus-Trees are pretty much 2-3-4-5-6-7-8-9-etc. trees where the max size of the nodes, n, is chosen so that n records can be stored in a single disk sector (i.e. each disk sector is a node in the tree, and the size of the sector determines the number of items stored in the node). This is because searching through 512 records linearly in memory is still MUCH faster than traversing down a level in the tree, which requires another disk seek/fetch, and O(512) is still O(1), which maintains O(lg n) for the tree.
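A minimal sketch of that trade-off: an iterative (non-recursive) search that scans each node linearly and counts how many nodes it touches, where each touched node stands in for one disk fetch. `BNode` and `search` are hypothetical names, and the tree is built by hand rather than by real insertion logic.

```python
class BNode:
    """Hypothetical in-memory stand-in for one disk sector of a B-tree."""

    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys held in this node
        self.children = children or []  # empty for a leaf, else len(keys) + 1


def search(root, target):
    """Return (found, nodes_fetched) using a loop instead of recursion."""
    node, fetched = root, 0
    while node is not None:
        fetched += 1  # one node visited, modelled as one disk fetch
        i = 0
        # Linear scan inside the node: cheap compared with a fetch.
        while i < len(node.keys) and node.keys[i] < target:
            i += 1
        if i < len(node.keys) and node.keys[i] == target:
            return True, fetched
        node = node.children[i] if node.children else None
    return False, fetched


if __name__ == "__main__":
    root = BNode([10, 20], [BNode([1, 5]), BNode([12, 15]), BNode([25, 30])])
    print(search(root, 15))  # found in a leaf after visiting two nodes
    print(search(root, 26))  # absent, also two nodes visited
```

The inner scan grows with the node size, but the number of fetches grows only with the height, which is the quantity a wide node is meant to shrink.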

Upvotes: 18

cha0site

Reputation: 10737

To be honest, I wasn't aware of 2-3-4 trees. In my Data Structures class we were taught 2-3 trees, and most of us implemented AVL trees for the wet part of the exercise.

But apparently, there's a generalization of this type of tree: the (a,b)-tree.
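For what it's worth, here is a small sketch of the constraints as I understand the usual definition: 2 <= a and b >= 2a - 1, with every non-root internal node having between a and b children and a non-leaf root having between 2 and b. The function names are made up for illustration.

```python
def is_valid_ab(a: int, b: int) -> bool:
    """Check the usual (a,b)-tree parameter constraint: 2 <= a <= (b + 1) / 2."""
    return a >= 2 and b >= 2 * a - 1


def child_count_ok(n_children: int, a: int, b: int, is_root: bool = False) -> bool:
    """Check an internal node's child count; the root may go as low as 2."""
    low = 2 if is_root else a
    return low <= n_children <= b


if __name__ == "__main__":
    print(is_valid_ab(2, 3))  # 2-3 tree
    print(is_valid_ab(2, 4))  # 2-3-4 tree
    print(is_valid_ab(2, 5))  # the 2-3-4-5 tree from the question
    print(is_valid_ab(3, 4))  # rejected: a split could not yield two legal nodes
```

Seen this way, 2-3 and 2-3-4 trees are just the (2,3) and (2,4) instances, and B-trees pick much larger parameters.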

Upvotes: 1
