Skip to main content

Union by Rank and Path Compression: Near-alpha(n) Amortized

What This Concept Is

A disjoint-set forest represents each class as an up-tree: every element has a parent pointer, and the root is the class representative with its parent pointing to itself. Two heuristics together make this representation astonishingly fast:

  • Union by rank (or size): when merging two trees, make the root of the smaller (by rank, an upper bound on height) become a child of the larger root. This keeps trees shallow.
  • Path compression: during find(x), after locating the root, rewrite every node on the find-path to point directly at the root.

With both heuristics, any sequence of m operations on n elements runs in O(m * alpha(n)) total time, where alpha(n) is the inverse Ackermann function. For any n that fits in the observable universe, alpha(n) <= 4, so the amortized cost per operation is effectively a small constant.

Neither heuristic alone is good enough. Union by rank alone gives O(log n) worst-case per operation (rank bounds height by log2 n). Path compression alone, applied to arbitrarily linked trees, gives O(log n) amortized per operation. Together they give O(alpha(n)) amortized -- a lower bound Tarjan proved is tight for this class of algorithms. The synergy comes from compression's effect on future find operations being amplified by the shallow structure that ranking produces.

A simple and equally fast variant is union by size: store the subtree size instead of rank, and always merge the smaller into the larger. The two bounds coincide asymptotically and size is often easier to expose to other queries (e.g., "what is the size of my component?").

Another useful optimization is path splitting / halving: instead of rewriting each node on the find-path to point at the root, point it at its grandparent. This halves the pointer cost per find and preserves the same alpha(n) bound with simpler iterative code.

Why It Matters Here

This is the canonical "amortized near-constant" result in computer science. It is also the single most important algorithmic reason union-find is used everywhere: you can run make-set, find, and union millions of times and the total cost is essentially linear in the number of operations.

It is also a case study in why a simple implementation with two heuristics beats much more elaborate designs that do not layer randomized or amortized ideas. Tarjan's analysis (1975, and the refined potential-method version in CLRS Chapter 19) is the first proof you will meet where the amortized cost is provably sub-logarithmic.

Concrete Examples

Example 1 -- rank-based union and path compression. Start with singletons 1..7. Ranks start at 0. Perform union(1, 2), union(3, 4), union(5, 6), union(1, 3), union(1, 5), tie-breaking in favor of the first argument when ranks are equal.

After the first three unions, ranks are 1 for roots 1, 3, 5. After union(1, 3), rank of 1 becomes 2. After union(1, 5), rank of 1 is still 2:

        1(rank 2)
/ | \
2 3 5
| |
4 6

Now call find(4). The find-path is 4 -> 3 -> 1. Path compression rewrites:

        1(rank 2)
/ | | | \
2 3 4 5 6

4 is now a direct child of the root. Every future find(4) is O(1).

Example 2 -- union by size tracking class cardinality. Initialize parent[i] = i, size[i] = 1. Sequence:

union(1, 2): roots 1, 2, sizes 1, 1; tie -> parent[2] = 1, size[1] = 2
union(3, 4): parent[4] = 3, size[3] = 2
union(1, 3): sizes 2, 2; tie -> parent[3] = 1, size[1] = 4
union(5, 6): parent[6] = 5, size[5] = 2
union(1, 5): sizes 4, 2; size[5] < size[1] -> parent[5] = 1, size[1] = 6

After the final union, size[find(4)] = 6 answers "how many elements in 4's class?" in O(alpha(n)). Augmenting with size is how problems like "number of connected components with more than 3 nodes" are solved in a single pass.

Common Confusion / Misconceptions

"Rank equals tree height." Rank is an upper bound on height. Path compression can shrink the actual height below the stored rank, but rank is never updated downward because doing so would complicate the amortized argument without changing asymptotics. In practice rank and height can diverge substantially after heavy compression.

"The bound is log*(n)." alpha(n) grows even slower than log*(n). Both are effectively constant for all practical n, but the correct bound proved by Tarjan is alpha(n), not log*(n). On a problem set, write alpha(n) unless you are explicitly proving a weaker bound.

"Path compression needs an explicit second pass." In the recursive version, path compression happens naturally on the return of the recursion -- p[x] = find(p[x]) does both. In iterative code, a single-pass "path halving" (point every other node at its grandparent) is simpler, faster in practice, and gives the same amortized bound.

"You cannot amortize randomized structures." Path compression is deterministic; the amortized bound holds worst-case over every operation sequence. No randomness is required. This contrasts with skip lists (randomized balance) and hash tables (randomized collision), which rely on probabilistic arguments.

How To Use It

Standard implementation (parent array p[], rank array r[]):

make-set(x):   p[x] = x;  r[x] = 0

find(x):
if p[x] != x:
p[x] = find(p[x]) # path compression
return p[x]

union(x, y):
rx = find(x); ry = find(y)
if rx == ry: return
if r[rx] < r[ry]: p[rx] = ry
else if r[rx] > r[ry]: p[ry] = rx
else: p[ry] = rx; r[rx] += 1

Iterative path-halving version of find (production-ready, no stack):

find(x):
while p[x] != x:
p[x] = p[p[x]] # halving
x = p[x]
return x

Engineering checklist before shipping:

  1. pick union by size or union by rank (not both); size is friendlier to downstream queries
  2. use path halving in the hot path to avoid recursion depth issues on large n
  3. track a components counter decremented on every successful union
  4. if you need class aggregates, store them at the root and combine in union

Why this works, intuitively:

  • union by rank alone gives O(log n) worst-case per operation because rank bounds height by log2 n
  • path compression alone gives O(log n) amortized per operation
  • the two together give O(alpha(n)) amortized per operation, a far stronger bound than either alone

Transfer / Where This Shows Up Later

  • S5 (OS): mount-namespace unification, cgroup-v2 hierarchical merges, and NUMA-node grouping all use union-find-style bookkeeping when building transitive-closure structures.
  • S6 (databases): Kruskal-style MST on the entity-relationship graph (for schema refactoring tools) and query optimizer equivalence-class propagation (join reordering, predicate push-down) use union-find over attribute references.
  • S7 (architecture): reasoning about microservice reachability ("which services transitively depend on which?") is a union-find problem once you add an edge between services on every observed call.
  • S8 (scale): offline analytics pipelines that detect connected components in massive graphs (GraphX, Spark's ConnectedComponents) reduce to union-find over Bloom-filtered edge streams.

Check Yourself

  1. Why does union by rank keep the tree shallow?
  2. Why can path compression be done during find without breaking correctness?
  3. Why is the amortized bound alpha(n) instead of O(log n)?
  4. Why is rank never decreased even when compression shortens the actual height?
  5. What is the difference between path compression, path splitting, and path halving? Why do all three give the same amortized bound?
  6. Union by rank vs union by size -- when does the distinction matter?

Mini Drill or Application

  1. Starting from singletons 1..8, execute union(1,2), union(3,4), union(5,6), union(7,8), union(1,3), union(5,7), union(1,5) with union by rank. Draw the forest.
  2. Now call find(8). Draw the forest after path compression.
  3. Repeat the whole sequence without either heuristic (link second argument into first, no compression). Compare the final forest shape.
  4. Estimate alpha(n) for n = 10^80 and compare to log2 n and log* n.
  5. Implement union-find with path halving and union by size in Rust or Python (~50 lines). Run 10^7 mixed operations and measure average time per operation; confirm it is roughly constant across n = 10^4, 10^5, 10^6.

Read This Only If Stuck