user5920484

Red Black Tree, and condition for coloring

Recently I've been thinking about converting a BST into a red-black tree purely by coloring its nodes.

What is a necessary and sufficient condition for being able to convert a BST into a red-black tree just by coloring, without any other change to that BST? (For example: is it enough to check that the longest path is at most twice the shortest path, or is there a condition on the height, or some other condition?)

Upvotes: 1

Views: 1223

Answers (2)

ruakh

Reputation: 183602

I believe that Matt Timmermans' answer is correct, but I don't find it very satisfying, since although it provides a good algorithm for determining if a binary tree is red-black–colorable, it doesn't really provide a characterization other than "run this algorithm" — and, worse yet, the algorithm refers to concepts that are specific to red-black trees, and that (IMHO) don't make sense outside that context.

So, below is a characterization that I think is more satisfying.


Let the "least-height" of a node be the least distance from it down to a ɴɪʟ descendant, and let its "greatest-height" be the greatest distance from it down to a ɴɪʟ descendant. That is:

  • ɴɪʟ.leastHeight = ɴɪʟ.greatestHeight = 0
  • for a non-ɴɪʟ node:
    • node.leastHeight = 1 + min(node.left.leastHeight, node.right.leastHeight)
    • node.greatestHeight = 1 + max(node.left.greatestHeight, node.right.greatestHeight)

So, for example, in this tree:

        2
       / \
      1   4
         /
        3

(where I've omitted the ɴɪʟ leaves for readability) we have these heights:

  • 1: least-height = greatest-height = 1
  • 2: least-height = 2; greatest-height = 3
  • 3: least-height = greatest-height = 1
  • 4: least-height = 1; greatest-height = 2
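The two recurrences translate directly into code. Here is a minimal Python sketch that recomputes the heights listed above; the Node class and the heights function are names I'm assuming for illustration, with None standing in for a ɴɪʟ leaf.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None   # None stands in for a NIL leaf
    right: Optional["Node"] = None


def heights(node: Optional[Node]) -> Tuple[int, int]:
    """Return (least_height, greatest_height) of node, per the recurrences above."""
    if node is None:
        return (0, 0)
    l_least, l_greatest = heights(node.left)
    r_least, r_greatest = heights(node.right)
    return (1 + min(l_least, r_least), 1 + max(l_greatest, r_greatest))


# The example tree: 2 is the root, 1 and 4 are its children, 3 is 4's left child.
n1, n3 = Node(1), Node(3)
n4 = Node(4, left=n3)
n2 = Node(2, left=n1, right=n4)

# Maps each key to its (least-height, greatest-height) pair.
example_heights = {n.key: heights(n) for n in (n1, n2, n3, n4)}
```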

Theorem. A binary tree is red-black–colorable if and only if, for every single node, its greatest-height is at most double its least-height, or equivalently, its least-height is at least half its greatest-height.

(The above example does satisfy this rule; and indeed, we can make it a red-black tree by coloring 3 red and the rest black.)
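As a sketch of how the condition in the theorem could be checked in a single pass, the Python below computes both heights bottom-up and tests every node along the way. The nested-tuple tree encoding and the names check and is_colorable are assumptions made for this illustration only.

```python
from typing import Tuple

# A tree is either None (a NIL leaf) or a pair (left_subtree, right_subtree).


def check(tree) -> Tuple[int, int, bool]:
    """Return (least_height, greatest_height, ok), where ok means that every
    node in the subtree satisfies greatest_height <= 2 * least_height."""
    if tree is None:
        return (0, 0, True)
    left, right = tree
    l_least, l_greatest, l_ok = check(left)
    r_least, r_greatest, r_ok = check(right)
    least = 1 + min(l_least, r_least)
    greatest = 1 + max(l_greatest, r_greatest)
    return (least, greatest, l_ok and r_ok and greatest <= 2 * least)


def is_colorable(tree) -> bool:
    """True if the theorem's condition holds at every node of the tree."""
    return check(tree)[2]


leaf = (None, None)
example = (leaf, (leaf, None))   # the four-node example tree from above
chain = (None, (None, leaf))     # three nodes in a straight line
```

Here is_colorable(example) holds, while the three-node chain fails at its root, whose least-height is 1 but whose greatest-height is 3.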


To prove this, we need one more definition. Let the "black-height" of a node in a red-black tree be the number of black nodes on the path from it down to any ɴɪʟ descendant, including itself if it is black. (By the definition of a red-black tree, this value is the same no matter which ɴɪʟ descendant is chosen.) That is:

  • ɴɪʟ.blackHeight = 0
  • for a black node: node.blackHeight = 1 + node.left.blackHeight = 1 + node.right.blackHeight
  • for a red node: node.blackHeight = node.left.blackHeight = node.right.blackHeight
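To make the definition concrete, here is a minimal Python sketch that computes the black-height of an already-colored tree and reports failure when the coloring breaks an invariant. It deliberately allows the node it is called on to be red. The Node class, the red flag, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    red: bool = False
    left: Optional["Node"] = None   # None stands in for a NIL leaf
    right: Optional["Node"] = None


def black_height(node: Optional[Node]) -> Optional[int]:
    """Return the black-height of node, or None if the coloring below it is
    invalid (unequal black-heights, or a red node with a red child)."""
    if node is None:
        return 0
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh is None or rh is None or lh != rh:
        return None
    if node.red:
        for child in (node.left, node.right):
            if child is not None and child.red:
                return None
        return lh        # a red node does not count toward the black-height
    return lh + 1        # a black node counts itself


def example(three_is_red: bool) -> Node:
    """The four-node example tree, with node 3 optionally colored red."""
    return Node(2, left=Node(1), right=Node(4, left=Node(3, red=three_is_red)))
```

With node 3 red and the rest black, the root's black-height is 2; with every node black, node 4's two sides disagree and the check fails.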

The "only if" direction of the theorem (that if a binary tree is red-black–colorable, then the greatest-height of any node is at most double its least-height) shouldn't be surprising, because that's sort of the whole point of a red-black tree. If the tree is red-black–colorable, choose one such coloring. For a black node with black-height b, every path from that node down to a ɴɪʟ descendant includes exactly b black nodes (including the node itself), so its least-height is at least b; and since a red node never has a red child, no such path can include more than b red nodes, so its greatest-height is at most 2b, which is at most double its least-height. For a red node with black-height b, every path from that node down to a ɴɪʟ descendant includes exactly b black nodes plus the node itself, so its least-height is at least b+1; and no such path can include more than b+1 red nodes (including the node itself), so its greatest-height is at most 2b+1, which is less than 2(b+1) and hence less than double its least-height.

The "if" direction — that if the greatest-height of every single node is at most double its least-height, then the tree is red-black–colorable — is trickier.

The proof turns out to be a bit simpler if we allow the trees to have red roots. This doesn't affect the result, because given a tree that satisfies all of the red-black invariants except that its root is red, we can just recolor the root black without breaking those other invariants. But I don't want to redefine the term "red-black tree" in the middle of a proof about red-black trees, so I'll instead use the term "red-black subtree", in reference to the fact that a subtree of a red-black tree has to satisfy all of the red-black invariants except that its root can be red.

The proof involves mathematical induction, so I'll actually prove a slightly stronger claim that enables the inductive step. The stronger claim keeps track of the root's color:

Theorem. If every node in a binary tree has a greatest-height that is at most double its least-height, then:

  • for any integer b in the range [root.greatestHeight / 2, root.leastHeight], the tree can be colored to become a red-black subtree whose root is black (or ɴɪʟ) and has black-height b;
  • if the root is not ɴɪʟ, then for any integer b in the range [(root.greatestHeight − 1) / 2, root.leastHeight − 1], the tree can be colored to become a red-black subtree whose root is red and has black-height b.

The first range is never empty (that's exactly the assumption on the root), so the first part gives us the "if" direction. The second part is needed because a node sometimes has to be red: in the example tree above, node 3 has to end up red, with black-height 0.

It's handy to combine the two parts: any integer b in the range [(root.greatestHeight − 1) / 2, root.leastHeight] is achievable with one root color or the other. Either b is at least root.greatestHeight / 2, so the first part applies; or b = (root.greatestHeight − 1) / 2 with root.greatestHeight odd, in which case root.greatestHeight ≤ 2 · root.leastHeight − 1, so b ≤ root.leastHeight − 1 and the second part applies. (For ɴɪʟ, the combined range is just {0}.)

Base case (root is ɴɪʟ): This is straightforward, since ɴɪʟ is inherently a red-black subtree, the range [ɴɪʟ.greatestHeight / 2, ɴɪʟ.leastHeight] is just {0}, and ɴɪʟ.blackHeight = 0. The second part doesn't apply.

Inductive case (root is not ɴɪʟ): We assume, by induction, that both children satisfy the theorem (either of them may be ɴɪʟ). Recall that root.leastHeight = 1 + min(root.{left,right}.leastHeight) and root.greatestHeight = 1 + max(root.{left,right}.greatestHeight). Then:

  • Black root. For any integer b in the range
    [root.greatestHeight / 2, root.leastHeight]
    = [(1 + max(root.{left,right}.greatestHeight)) / 2, 1 + min(root.{left,right}.leastHeight)],
    we have that b−1 is in the range
    [(max(root.{left,right}.greatestHeight) − 1) / 2, min(root.{left,right}.leastHeight)],
    which is contained in the combined range of each child. So we can color both children as red-black subtrees whose roots have black-height b−1 (with whichever root colors that requires), and color root black. A black node may have children of either color, so root is now a red-black subtree with black-height b, as desired.
  • Red root. For any integer b in the range
    [(root.greatestHeight − 1) / 2, root.leastHeight − 1]
    = [max(root.{left,right}.greatestHeight) / 2, min(root.{left,right}.leastHeight)],
    we have that b is in the first-part range of each child, so we can color both children as red-black subtrees whose roots are black (or ɴɪʟ) and have black-height b, and color root red. No red node gets a red child, so root is now a red-black subtree with black-height b, as desired.
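The construction can be sketched in Python. This is a minimal sketch under my own assumptions rather than a definitive implementation: it recurses one level at a time and decides each node's color from its target black-height b (black when 2b is at least the node's greatest-height, red otherwise). The Node class and the names annotate, paint, color_tree, and red_keys are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None    # None stands in for a NIL leaf
    right: Optional["Node"] = None
    red: bool = False
    least: int = 0                   # least-height, filled in by annotate()
    greatest: int = 0                # greatest-height, filled in by annotate()


def _heights(n: Optional[Node]):
    return (n.least, n.greatest) if n is not None else (0, 0)


def annotate(n: Optional[Node]) -> bool:
    """Store both heights on every node; return True if every node
    satisfies greatest <= 2 * least."""
    if n is None:
        return True
    ok_left = annotate(n.left)
    ok_right = annotate(n.right)
    (ll, lg), (rl, rg) = _heights(n.left), _heights(n.right)
    n.least, n.greatest = 1 + min(ll, rl), 1 + max(lg, rg)
    return ok_left and ok_right and n.greatest <= 2 * n.least


def paint(n: Optional[Node], b: int) -> None:
    """Color the subtree at n so that n has black-height b.
    Assumes (n.greatest - 1) / 2 <= b <= n.least (b == 0 for NIL)."""
    if n is None:
        return
    if 2 * b >= n.greatest:
        n.red = False            # black root: children get black-height b - 1
        paint(n.left, b - 1)
        paint(n.right, b - 1)
    else:
        n.red = True             # red root: children keep black-height b
        paint(n.left, b)
        paint(n.right, b)


def color_tree(root: Optional[Node]) -> bool:
    """Color the tree in place; return False if the condition fails."""
    if not annotate(root):
        return False
    if root is not None:
        paint(root, root.least)  # b = least-height always yields a black root
    return True


def red_keys(n: Optional[Node]) -> List[int]:
    """In-order list of the keys of the red nodes."""
    if n is None:
        return []
    return red_keys(n.left) + ([n.key] if n.red else []) + red_keys(n.right)


root = Node(2, left=Node(1), right=Node(4, left=Node(3)))
colored = color_tree(root)
```

On the four-node example this colors 3 red and everything else black, matching the coloring mentioned earlier.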

Upvotes: 1

Matt Timmermans

Reputation: 59368

A null binary tree is a red-black tree. A non-null binary tree is a red-black tree if:

  1. the root is black;

  2. the number of black nodes on any path from the root to null is the same; and

  3. no such path has two non-black (i.e., red) nodes in a row.

We'll refer to the number of black nodes on every path from root to null as the tree's "black-height".

In any non-null red-black tree, both children of the root have the same black-height and will also be red-black trees if you make sure their roots are colored black. Coloring a red root black will increase the black-height of the tree by 1, so if the children of a root are made into red-black trees, their black-heights may differ by at most 1.

Similarly, given two red-black trees with the same black-height, you can join them under a new black root to create a new red-black tree.

Given a red-black tree and a red-rooted tree with red-black tree children of the same black-height, you can also join them under a new black root.

Two red-rooted trees with red-black tree children can have their roots recolored and joined under a new root similarly.

Henceforth, a red root with red-black tree children of the same black-height will be referred to as a red-rooted tree.

Given this, we can define the condition for red-black colorability recursively like so:

A binary tree can be colored as a red-black tree with black-height X if and only if:

  1. it is null and X==0; OR
  2. it is non-null and both of its children can be colored as red-black trees or red-rooted trees with black-height X-1.

A binary tree can be colored as a red-rooted tree with black-height X if and only if it is non-null and both of its children can be colored as red-black trees with black-height X.

Given any binary tree, then, we can calculate the black-heights at which it could be colored as a red-black tree or a red-rooted tree:

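In Python (assuming tree nodes expose left and right, and a null tree is None):

def redAndBlackHeights(tree):
    # Returns (red_heights, black_heights): the sets of black-heights at which
    # tree can be colored as a red-rooted tree / as a red-black tree.
    if tree is None:
        return (set(), {0})  # only a red-black tree with bh=0
    (left_red_heights, left_black_heights) = redAndBlackHeights(tree.left)
    (right_red_heights, right_black_heights) = redAndBlackHeights(tree.right)

    # A red root needs two red-black children of equal black-height.
    red_heights = left_black_heights & right_black_heights
    # A black root accepts either kind of child and adds 1 to the black-height.
    black_heights = (
        {x + 1 for x in left_red_heights | left_black_heights}
        & {x + 1 for x in right_red_heights | right_black_heights}
    )
    return (red_heights, black_heights)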

A tree is colorable as a red-black tree if and only if redAndBlackHeights(tree) returns at least one black-rooted height.

Since there are at most O(log N) possible heights in a tree of size N, this takes O(N log N) time.

It turns out, actually, that all of the sets of heights are contiguous ranges, and if you represent them as such the algorithm takes O(N) time.
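A minimal Python sketch of that range-based variant follows, assuming each set is stored as an inclusive (lo, hi) pair, with None for the empty set. The Node class and the names Range, intersect, union, shift, and red_and_black_ranges are assumptions for illustration. The union helper relies on the contiguity claim above and does not verify it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Range = Optional[Tuple[int, int]]   # inclusive (lo, hi); None is the empty set


@dataclass
class Node:
    left: Optional["Node"] = None   # None stands in for a null tree
    right: Optional["Node"] = None


def intersect(a: Range, b: Range) -> Range:
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def union(a: Range, b: Range) -> Range:
    """Union of two ranges that are assumed to overlap or be adjacent."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def shift(a: Range) -> Range:
    """Add 1 to every height in the range."""
    return None if a is None else (a[0] + 1, a[1] + 1)


def red_and_black_ranges(tree: Optional[Node]) -> Tuple[Range, Range]:
    """Return (red_range, black_range) of achievable black-heights,
    doing a constant amount of work per node."""
    if tree is None:
        return (None, (0, 0))       # only a red-black tree with bh=0
    left_red, left_black = red_and_black_ranges(tree.left)
    right_red, right_black = red_and_black_ranges(tree.right)
    red = intersect(left_black, right_black)
    black = intersect(
        shift(union(left_red, left_black)),
        shift(union(right_red, right_black)),
    )
    return (red, black)


def is_colorable(tree: Optional[Node]) -> bool:
    return red_and_black_ranges(tree)[1] is not None


example = Node(left=Node(), right=Node(left=Node()))   # four nodes
chain = Node(right=Node(right=Node()))                 # three nodes in a line
```

Each node does a fixed number of range operations, which is where the O(N) bound comes from.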

Upvotes: 1
