Binary search trees work best when they are balanced or the path
length from root to any leaf is within some bounds. The red-black tree algorithm is a method
for balancing trees. The name derives from the fact that each node is colored *red* or
*black*
and the color of the node is instrumental in determining the balance of the tree.
During insert and delete operations nodes may be rotated to maintain tree balance. Both average
and worst-case insert
delete
and search time is O(lg *n*).
For details
consult
Cormen
[2001].

A red-black tree is a balanced binary search tree with the following properties:

- Every node is colored red or black.
- Every leaf is a sentinel node and is colored black.
- If a node is red then both its children are black.
- Every simple path from a node to a descendant leaf contains the same number of black nodes.
- The root is always black.

The number of black nodes on a path from root to leaf is known as the black-height of a tree. The above properties guarantee that any path from the root to a leaf is no more than twice as long as any other path. To see why this is true consider a tree with a black-height of three. The shortest distance from root to leaf is two (B-B-B). The longest distance from root to leaf is four (B-R-B-R-B). It is not possible to insert more black nodes as this would violate property 4. Since red nodes must have black children (property 3) having two red nodes in a row is not allowed. The largest path we can construct consists of an alternation of red and black nodes.

In general
given a tree with a black-height of *n*
the shortest distance from root
to leaf is *n* - 1
and the longest distance is 2(*n* - 1). All operations on the
tree must maintain the properties listed above. In particular
operations that insert or delete
nodes from the tree must abide by these rules.

To insert a node
search the tree for an insertion point and add the node
to the tree. The new node replaces an existing sentinel node
at the bottom of the tree
and has two
sentinel nodes as children. In the implementation
a sentinel node
is simply a pointer to a common *sentinel* node that is colored black.
After insertion the new node is colored
red. Then the parent of the node is examined to determine if the red-black
tree properties have been
maintained. If necessary
make adjustments to balance the tree.

The black-height property (property 4) is preserved when we insert a red node with two sentinel children. We must also ensure that both children of a red node are black (property 3). Although both children of the new node are black (they’re sentinel) consider the case where the parent of the new node is red. Inserting a red node under a red parent would violate this property. There are two cases to consider.

Figure 3-6 illustrates a red-red violation. Node `X`

is the newly inserted node
and both parent and uncle are colored red. A simple recoloring removes the red-red violation.
After recoloring the grandparent (node `B`

) must be checked for validity
as its
parent may also be red and we can't have two red nodes in a row. This has the effect of propagating
a red node up the tree. On completion the root of the tree is marked black. If it was originally
red the black-height of the tree increases by one.

**Figure 3-6: Insertion - Red Parent
Red Uncle**

Figure 3-7 illustrates a red-red violation where the uncle is colored black. If we attempt
to recolor nodes
changing node `A`

to black
the tree is out of balance since we've increased
the black-height of the left branch without changing the right branch. If we also change node
`B`

to red
then the black-height of both branches is reduced and the tree is still out of balance.
If we start over and change node `A`

to black and node `C`

to red the situation is worse since
we've increased the black-height of the left branch
and decreased the black-height of the
right branch. To solve this problem we will rotate and recolor the nodes as shown. At this
point the algorithm terminates since the top of the subtree (node `A`

) is colored
black and no red-red conflicts were introduced.

**Figure 3-7: Insertion - Red Parent
Black Uncle**

To insert a node we may have to recolor or rotate to preserve the red-black tree properties.
If rotation is done
the algorithm terminates. For simple recolorings we're left with a red
node at the head of the subtree and must travel up the tree one step and repeat the process
to ensure the black-height properties are preserved. In the worst case we must go all the way
to the root. Timing for insertion is O(lg *n*). The technique
and timing for deletion is similar.

An ANSI-C implementation for red-black trees is
included. Typedefs
`recType`

`keyType`

and comparison operators `compLT`

and
`compEQ`

should be altered to reflect the data stored in the tree.
Typedef `NodeType`

defines each node and consists of `left`

`right`

and `parent`

pointers
designating each child and the parent. The node color is stored in `color`

and is either `RED`

or `BLACK`

. All leaf nodes of the tree are `SENTINEL`

nodes
to simplify coding. The tree is based at `root`

and initially is
a `SENTINEL`

node.

Function `rbtInsert`

allocates a new node and inserts it in the
tree. Subsequently
it calls `insertFixup`

to ensure that the red-black
tree properties are maintained. Function `rbtErase`

deletes a
node from the tree. To maintain red-black tree properties
`deleteFixup`

is called. Function `rbtFind`

searches
the tree for a particular value.

Consider an industrial-strength version of the red-black tree algorithm. Available as three separate files:

- include file: function prototypes
- implementation file: red-black tree algorithm
- driver file: test code

Modifications include:

- reentrant code
- iterator support
- void pointers for both
**key**and**value**to support arbitrary types **compare**function callback

In addition there is total isolation between the driver and red-black tree
data structures. For example
the driver does not have access to the `NodeType`

data structure or
the inner-workings of the algorithm.

The red-black tree algorithm has been implemented as objects using a module for the algorithm and a class for the nodes. It has also been implemented as a class using arrays. The array implementation is recommended.