Red-Black Trees: Invariants, Rotations, and O(log n) Guarantees
What This Concept Is
A red-black tree is a BST that enforces balance using five invariants. Every node is colored either red or black, and:
- the root is black
- every leaf (sentinel
NIL) is black - red nodes have black children (no two reds in a row)
- every root-to-leaf path contains the same number of black nodes (the black-height)
- standard BST property holds
Those invariants force the tree to be approximately balanced: the longest root-to-leaf path is at most twice the shortest one. That bounds the height by 2 log(n + 1), so every operation runs in O(log n) worst-case.
Rotations are the local restructuring moves that restore the invariants after insertion or deletion. A single rotation moves one edge, preserves BST ordering, and changes local structure without touching the rest of the tree. Because every fixup path from the insertion point to the root uses at most two rotations for insert and three for delete, the total rotation cost per operation is O(1); the remaining O(log n) is recoloring and traversal.
A useful alternative framing: a red-black tree is isomorphic to a 2-3-4 tree (a B-tree of order 4). The black nodes correspond to internal 2-3-4 nodes and the red nodes to "glued" children inside the same 2-3-4 node. Every proof about red-black trees that is confusing in the coloring language is usually straightforward in the 2-3-4 language. Left-leaning red-black trees (Sedgewick 2008) enforce a canonical orientation that eliminates half the cases and is the shape actually implemented in Java's TreeMap.
Why It Matters Here
Red-black trees are the dominant deterministic balanced BST in practice:
- C++
std::map/std::setuse them - Java
TreeMapuses them - the Linux kernel uses them (
rbtree) for scheduler run-queues (CFS), virtual memory areas (VMAs), and the EPOLL watch set - they are the canonical example of invariant-driven balancing in CLRS
If you understand red-black trees, you understand the pattern that generalizes: define an invariant that caps height, restore it with local moves. The cost profile (O(1) amortized rotations, O(log n) worst-case time, low constant factors) is why they, and not AVL trees, win the production popularity contest.
Concrete Examples
Example 1 -- eight insertions and invariant check. Insert 7, 3, 18, 10, 22, 8, 11, 26 into an empty red-black tree. After all insertions and recolorings, one valid configuration is:
10(B)
/ \
7(B) 18(R)
/ / \
3(R) 11(B) 22(B)
/ \
8(R) 26(R)
Check the invariants:
- root
10is black - no red has a red child (
18red has black children,3,8,26are red with black parents) - every root-to-
NILpath has exactly two black nodes (black-height = 2)
The height is 4, which satisfies h <= 2 log(9) approx 6.3.
Example 2 -- red uncle recoloring. Starting from a tree with grandparent G(B), parent P(R), uncle U(R), and newly inserted N(R) where N is a left child of P:
G(B) G(R)
/ \ -> / \
P(R) U(R) P(B) U(B)
/ /
N(R) N(R)
The fixup recolors P and U black, flips G red, and recurses upward with G as the new "inserted node." This is the only one of the six insertion cases that can propagate up the tree; the rotation cases stop immediately. Each level of propagation is O(1) work, giving O(log n) worst-case recoloring per insert.
Common Confusion / Misconceptions
"Rotations break the BST property." They do not. A left rotation on node x takes x's right child y, makes y the new subtree root, and makes x the left child of y. The subtree rooted at the old x position still has all keys <= y <= all keys in y's right subtree, because BST ordering was already preserved by the previous structure.
"The inserted node should be colored black to match the parent." No. After insertion, the new node is colored red, not black. Inserting red never changes the black-height, so only invariant 3 (no red-red) can be violated, and the fixup cases restore it. Inserting black would violate invariant 4 (equal black-heights) and that violation is much harder to repair locally.
"Black-height counts all nodes on the path." Only the black nodes. The tree's height (which does count red nodes) is at most 2 * black-height, and that is why the 2 log(n + 1) bound holds.
"Delete is just insert in reverse." Delete is substantially harder. The "double-black" token in CLRS's deletion fixup exists because removing a black node short-circuits invariant 4 on one path. Expect six symmetric cases and plan your tests accordingly -- or use a left-leaning variant to halve the count.
How To Use It
Operational rules for insertion:
- BST-insert the new node as if it were an ordinary BST, then color it red
- while the parent is red, run a fixup case based on the uncle's color:
- uncle red: recolor parent and uncle black, grandparent red, move up
- uncle black, zig-zag shape: rotate to straighten
- uncle black, straight shape: rotate grandparent and recolor
- color the root black at the end
For deletion:
- BST-delete as usual, tracking the replacement node
- if a black node was removed, a "double-black" token moves up the tree
- fixup cases push the token up or eliminate it via rotations and recoloring
For choosing a variant:
- production code: reach for your language's standard library (
std::map,TreeMap,rbtree.h) - teaching code: implement left-leaning red-black (Sedgewick 2008); half the cases
- augmented uses (order statistics, interval trees): plain red-black with a per-node size or max field
The key performance fact: each insert or delete does at most O(1) rotations and O(log n) recolorings up the tree.
Transfer / Where This Shows Up Later
- S5 (OS): the Linux kernel
rbtreebacks the Completely Fair Scheduler (task ordered by vruntime), the VMA tree in the memory subsystem, and epoll's interest set. Readinginclude/linux/rbtree.his a rite of passage. - S6 (databases): B+trees on disk use the same 2-3-4 isomorphism; understanding red-black recoloring clarifies why B-tree splits and merges are local.
- S7 (architecture): ADRs choosing
HashMapvsTreeMap(orHashSetvsTreeSet) should cite "need ordered iteration / range queries" as the deciding factor, because that is the capability red-black trees pay for. - S8 (scale): load-balancers and rate-limiters that enforce token-bucket timers use red-black trees to schedule next-fire events (
timer_fdunder the hood).
Check Yourself
- State the five red-black invariants.
- Why is the height at most
2 log(n + 1)? - Why is a newly inserted node colored red, not black?
- What is the effect of a left rotation on in-order key ordering?
- How does the 2-3-4 tree correspondence let you predict the number of insert-fixup cases?
- For deletion, why does a "double-black" token arise only when a black node is removed and not when a red one is?
Mini Drill or Application
- Starting from an empty red-black tree, insert the keys
10, 20, 30, 15, 25, 5, 1one at a time. Draw the tree after each insertion, showing colors and rotations. - From the final tree, delete the root. Draw the result and name the fixup case.
- Compute the black-height of your final tree and verify invariant 4 by walking three different root-to-leaf paths.
- Implement insert-only red-black (Sedgewick LLRB style) in Python or Java in at most 80 lines; run on 1,000,000 random inserts and assert
height <= 2 * log2(n + 1).
Read This Only If Stuck
- CLRS: Properties of red-black trees
- CLRS: Rotations
- CLRS: Insertion (Part 1)
- CLRS: Insertion (Part 2)
- CLRS: Deletion (Part 1)
- CLRS: Deletion (Part 2)
- Sedgewick: Balanced trees (Part 1)
- Sedgewick: Balanced trees (Part 2)
- Sedgewick, Left-Leaning Red-Black Trees (Princeton, 2008)
- Pat Morin, Open Data Structures -- RedBlackTree chapter