Kruskal with Union-Find
What This Concept Is
Kruskal's algorithm builds an MST by sorting edges by weight and greedily adding each edge that does not create a cycle:
sort E in nondecreasing order of weight
make-set(v) for each v in V
MST := empty
for each edge (u, v, w) in sorted order:
if find(u) != find(v):
union(u, v)
add (u, v) to MST
return MST
The cycle check is the bottleneck. A union-find (disjoint-set) data structure answers "are these two vertices in the same component?" in effectively O(alpha(|V|)) amortized time, where alpha is the inverse Ackermann function - slower than O(log) but unmeasurably close to O(1) in practice.
Total time: O(|E| log |E|) for the sort plus O(|E| alpha(|V|)) for the union-find, i.e., O(|E| log |V|) overall (log E = Theta(log V) for simple graphs).
Two optimizations are essential to achieve the inverse-Ackermann bound:
- Union by rank (or size). Always attach the smaller tree under the larger root. Prevents unbalanced chains.
- Path compression. On
find(x), rewrite parent pointers so they point directly at the root.
Either optimization alone gives O(log |V|) per operation. Both together yield the celebrated O(alpha(|V|)) amortized bound (Tarjan).
Why It Matters Here
Kruskal is the cleanest demonstration of "greedy plus good bookkeeping" in this module. It also gives you the union-find data structure, which reappears in:
- connectivity queries under incremental edge additions
- offline-query algorithms (small-to-large, DSU on tree)
- Tarjan's offline LCA algorithm
- percolation simulations
- image-segmentation (Felzenszwalb-Huttenlocher)
When to prefer Kruskal over Prim:
- the edges are already (or can be cheaply) sorted
- the graph is sparse; Kruskal scales with
|E| - you need the MST edges in increasing weight order as a by-product (useful for clustering, k-th MST)
- the graph is given as an edge list (a common input format in competitive programming)
Concrete Example
Graph:
Sort: (a,b)=1, (b,c)=2, (b,d)=3, (c,d)=4, (a,c)=5, (c,e)=6, (d,e)=7.
Initial components: {a}, {b}, {c}, {d}, {e}.
| step | edge | find(u), find(v) | action | components after |
|---|---|---|---|---|
| 1 | (a,b)=1 | a, b | union, add | {a,b}, {c}, {d}, {e} |
| 2 | (b,c)=2 | a, c | union, add | {a,b,c}, {d}, {e} |
| 3 | (b,d)=3 | a, d | union, add | {a,b,c,d}, {e} |
| 4 | (c,d)=4 | a, a | skip (same) | unchanged |
| 5 | (a,c)=5 | a, a | skip (same) | unchanged |
| 6 | (c,e)=6 | a, e | union, add | {a,b,c,d,e} |
| 7 | (d,e)=7 | a, a | skip; also done (` | V |
Total MST weight: 1 + 2 + 3 + 6 = 12.
Common Confusion / Misconceptions
- "Union-find is just for MST." No. It is the workhorse for any "maintain components under adding edges" problem: connectivity queries, offline dynamic problems, percolation simulations.
- "Path compression alone is enough." It gives
O(log |V|)amortized per operation, not inverse-Ackermann. Combine with union-by-rank for the tight bound. - "Kruskal is
O(|E| log |E|)because of the sort." Correct, but notelog |E| = log |V^2| = 2 log |V|, so asymptotic equivalents are usually written asO(|E| log |V|). - "Kruskal needs a connected graph." If the graph is disconnected, Kruskal returns a minimum spanning forest.
- "The cycle check is an explicit DFS." It is
find(u) != find(v)- anO(alpha)operation - not a DFS. - "Integer weights make Kruskal
O(|E|)." Only with radix sort and specific conditions. The generic version isO(|E| log |V|).
How To Use It
Baseline implementation skeleton:
parent[v] := v; rank[v] := 0
find(x):
while parent[x] != x:
parent[x] := parent[parent[x]] # path compression (halving)
x := parent[x]
return x
union(x, y):
rx := find(x); ry := find(y)
if rx = ry: return false
if rank[rx] < rank[ry]: swap rx, ry
parent[ry] := rx
if rank[rx] = rank[ry]: rank[rx] := rank[rx] + 1
return true
Run Kruskal by sorting edges and calling union on each; stop when you have added |V| - 1 edges.
Integration checklist:
- verify edges are sorted stably (ties do not matter for MST correctness)
- early-exit after
|V| - 1edges to avoid processing the tail - for weighted graphs with floating-point weights, break ties deterministically if you need reproducibility
- if you want the minimum spanning forest, run Kruskal to completion (will produce multiple components)
Transfer / Where This Shows Up Later
- S3 (systems). File-system quota and cgroup membership sometimes use union-find for fast "same-group" queries.
- S4 (compilers). Static single assignment construction uses union-find for variable renaming; type unification uses it to merge equivalence classes.
- S5 (OS). Page-frame merging for huge pages uses union-find; kernel ID namespaces map via union-find.
- S6 (databases). Equi-join inference (which columns are guaranteed equal by transitivity) uses union-find.
- S7 (DDD). Merging bounded contexts at the "shared-kernel" level conceptually does a union on terminology sets.
- S8 (distributed). Partition detection after network splits: union-find over surviving nodes identifies the quorum-reachable set.
Back-link to concept 13 (cut and cycle properties) since they justify Kruskal's correctness, and forward to concept 16 (MST variants) for single-linkage clustering.
Check Yourself
- Why is the running time
O(|E| log |V|)rather thanO(|E| log |E|)asymptotically (hint:|E| <= |V|^2)? - What does the cut property guarantee about the edges Kruskal adds?
- Why does union by rank + path compression yield amortized inverse-Ackermann time?
- Why does stopping after
|V| - 1edges give a tree (not a forest or a multigraph)? - Sketch why
find(x)with path compression alone still costsO(log |V|)amortized but notO(alpha). - Give a production scenario where you would use union-find without Kruskal.
Mini Drill or Application
Implementation drill.
- Implement union-find with union-by-rank and path compression. Test against a naive
O(|V|)-per-op implementation on the same sequence of operations. - Implement Kruskal on top of your union-find. Verify correctness by comparing the total weight with a brute-force MST on small graphs (
|V| <= 8). - Use your implementation to solve a clustering task: connect each point to its
knearest neighbors, run Kruskal, and stop when you have|V| - cedges to producecclusters. - Apply Kruskal to a percolation problem: randomly order edges of a grid; run Kruskal until the top and bottom of the grid are in the same component.
- Benchmark Kruskal (union-find) vs a naive
O(|V|^2)Prim on random sparse graphs; observe Kruskal wins as density decreases.
Read This Only If Stuck
- CLRS 21.2: Kruskal and Prim
- CLRS 21.2: Kruskal and Prim (Part 2)
- CLRS 19.4: Analysis of union-by-rank with path compression
- ADM 6.1.2: Kruskal's Algorithm
- ADM 6.1.3: The Union-Find Data Structure
- Sedgewick: Connectivity (union-find context)
- Competitive Programming 4.3: Minimum Spanning Tree
- Competitive Programming 4.3.4: Other MST applications
- cp-algorithms: MST - Kruskal
- Sedgewick & Wayne, algs4 booksite: MSTs