Tanmoy Banerjee
Tanmoy Banerjee

Reputation: 1453

What are the minimum number of keys a node must contain for a B Tree of order n?

I have two books of Data Structures.In two books, there is two different approach of B-Tree insertion:

Suppose i want to inset a value k into a B-Tree. after searching for a appropriate leaf node to insert the value k, values present in the particular leaf node is counted. If leaf node is full then :

1.A single median is chosen from among the leaf's elements and the new element.After that split the node into two nodes.median value is shifted to parent node.

2.Split the node into two nodes after m/2-1(m is order of tree)position.median value is shifted to parent node.After that insert the value to the appropriate position.

Which one of the following method is correct?

Another Question:

What are the minimum number of keys a node must contain for a B-Tree of order n?I have searched the internet and books but unable to get the exact equation by which I can find out the minimum no of keys: e,g if order=5 then minimum number of keys for any node(except root)is 2 .How is this coming?

Any answers would be greatly appreciated!

Upvotes: 3

Views: 36209

Answers (1)

Justin
Justin

Reputation: 4216

According to Knuth's definition, a B-tree of order m is a tree which satisfies the following properties:

  • Every node has at most m children.
  • Every non-leaf node (except root) has at least ceil(m⁄2) children.
  • The root has at least two children if it is not a leaf node.
  • A non-leaf node with k children contains k−1 keys.
  • All leaves appear in the same level, and carry information.

Each internal node’s keys act as separation values which divide its subtrees. For example, if an internal node has 3 child nodes (or subtrees) then it must have 2 keys: a1 and a2. All values in the leftmost subtree will be less than a1, all values in the middle subtree will be between a1 and a2, and all values in the rightmost subtree will be greater than a2.

Internal nodes

Internal nodes are all nodes except for leaf nodes and the root node. They are usually represented as an ordered set of elements and child pointers. Every internal node contains a maximum of U children and a minimum of L children. Thus, the number of elements is always 1 less than the number of child pointers (the number of elements is between L−1 and U−1). U must be either 2L or 2L−1; therefore each internal node is at least half full. The relationship between U and L implies that two half-full nodes can be joined to make a legal node, and one full node can be split into two legal nodes (if there’s room to push one element up into the parent). These properties make it possible to delete and insert new values into a B-tree and adjust the tree to preserve the B-tree properties.

The root node

The root node’s number of children has the same upper limit as internal nodes, but has no lower limit. For example, when there are fewer than L−1 elements in the entire tree, the root will be the only node in the tree, with no children at all.

Leaf nodes

Leaf nodes have the same restriction on the number of elements, but have no children, and no child pointers.

http://en.wikipedia.org/wiki/B-Tree

ADDING:

A B-Tree of order 5 (max number of children) [4 would be the max number of keys] using Knuths definition.

The minimum number of children for an internal node after a split would be 3 [2 keys]. Since a node splits when it overflows the order of the B-Tree.

B-Tree for the list 11108, 3267, 11357, 12080, 6092

After adding 3267, 11108, 11357, 12080; this is showing keys, not children.

└── 3267, 11108, 11357, 12080

Then adding 6092; this is showing keys, not children.

Before split (number of keys > order-1) [5 > 5-1=4]:

└── 3267, 6092, 11108, 11357, 12080

After split:

└── 11108
    ├── 3267, 6092
    └── 11357, 12080

Note: the root node does not have the min number of keys/children as other nodes.

DELETING:

Rebalancing after deletion

If deleting an element from a leaf node has brought it under the minimum size, some elements must be redistributed to bring all nodes up to the minimum. In some cases the rearrangement will move the deficiency to the parent, and the redistribution must be applied iteratively up the tree, perhaps even to the root. Since the minimum element count doesn't apply to the root, making the root be the only deficient node is not a problem. The algorithm to rebalance the tree is as follows:[citation needed]

  1. If the right sibling has more than the minimum number of elements

    • Add the separator to the end of the deficient node
    • Replace the separator in the parent with the first element of the right sibling
    • Append the first child of the right sibling as the last child of the deficient node
  2. Otherwise, if the left sibling has more than the minimum number of elements

    • Add the separator to the start of the deficient node
    • Replace the separator in the parent with the last element of the left sibling
    • Insert the last child of the left sibling as the first child of the deficient node
  3. If both immediate siblings have only the minimum number of elements

    • Create a new node with all the elements from the deficient node, all the elements from one of its siblings, and the separator in the parent between the two combined sibling nodes
    • Remove the separator from the parent, and replace the two children it separated with the combined node
    • If that brings the number of elements in the parent under the minimum, repeat these steps with that deficient node, unless it is the root, since the root is permitted to be deficient

The only other case to account for is when the root has no elements and one child. In this case it is sufficient to replace it with its only child.

Pre-deleting node 9062.

└── 6169, 12789
    ├── 1009, 4238, 5139
    ├── 6625, 9062
    └── 12909, 14508, 14703, 14985

Before balancing

└── 6169, 12789
    ├── 1009, 4238, 5139
    ├── 6625 
    └── 12909, 14508, 14703, 14985

After balancing

└── 6169, 12909
    ├── 1009, 4238, 5139
    ├── 6625, 12789
    └── 14508, 14703, 14985

As you can see, middle node child borrowed 12789 from it's parent to meet the minimum requirements and the parent borrowed 12909 from it's right most child to meet it's minimum requirements.

Upvotes: 8

Related Questions