Skip to main content

Kruskal with Union-Find

What This Concept Is

Kruskal's algorithm builds an MST by sorting edges by weight and greedily adding each edge that does not create a cycle:

sort E in nondecreasing order of weight
make-set(v) for each v in V
MST := empty
for each edge (u, v, w) in sorted order:
if find(u) != find(v):
union(u, v)
add (u, v) to MST
return MST

The cycle check is the bottleneck. A union-find (disjoint-set) data structure answers "are these two vertices in the same component?" in effectively O(alpha(|V|)) amortized time, where alpha is the inverse Ackermann function - slower than O(log) but unmeasurably close to O(1) in practice.

Total time: O(|E| log |E|) for the sort plus O(|E| alpha(|V|)) for the union-find, i.e., O(|E| log |V|) overall (log E = Theta(log V) for simple graphs).

Two optimizations are essential to achieve the inverse-Ackermann bound:

  • Union by rank (or size). Always attach the smaller tree under the larger root. Prevents unbalanced chains.
  • Path compression. On find(x), rewrite parent pointers so they point directly at the root.

Either optimization alone gives O(log |V|) per operation. Both together yield the celebrated O(alpha(|V|)) amortized bound (Tarjan).

Why It Matters Here

Kruskal is the cleanest demonstration of "greedy plus good bookkeeping" in this module. It also gives you the union-find data structure, which reappears in:

  • connectivity queries under incremental edge additions
  • offline-query algorithms (small-to-large, DSU on tree)
  • Tarjan's offline LCA algorithm
  • percolation simulations
  • image-segmentation (Felzenszwalb-Huttenlocher)

When to prefer Kruskal over Prim:

  • the edges are already (or can be cheaply) sorted
  • the graph is sparse; Kruskal scales with |E|
  • you need the MST edges in increasing weight order as a by-product (useful for clustering, k-th MST)
  • the graph is given as an edge list (a common input format in competitive programming)

Concrete Example

Graph:

Sort: (a,b)=1, (b,c)=2, (b,d)=3, (c,d)=4, (a,c)=5, (c,e)=6, (d,e)=7.

Initial components: {a}, {b}, {c}, {d}, {e}.

stepedgefind(u), find(v)actioncomponents after
1(a,b)=1a, bunion, add{a,b}, {c}, {d}, {e}
2(b,c)=2a, cunion, add{a,b,c}, {d}, {e}
3(b,d)=3a, dunion, add{a,b,c,d}, {e}
4(c,d)=4a, askip (same)unchanged
5(a,c)=5a, askip (same)unchanged
6(c,e)=6a, eunion, add{a,b,c,d,e}
7(d,e)=7a, askip; also done (`V

Total MST weight: 1 + 2 + 3 + 6 = 12.

Common Confusion / Misconceptions

  • "Union-find is just for MST." No. It is the workhorse for any "maintain components under adding edges" problem: connectivity queries, offline dynamic problems, percolation simulations.
  • "Path compression alone is enough." It gives O(log |V|) amortized per operation, not inverse-Ackermann. Combine with union-by-rank for the tight bound.
  • "Kruskal is O(|E| log |E|) because of the sort." Correct, but note log |E| = log |V^2| = 2 log |V|, so asymptotic equivalents are usually written as O(|E| log |V|).
  • "Kruskal needs a connected graph." If the graph is disconnected, Kruskal returns a minimum spanning forest.
  • "The cycle check is an explicit DFS." It is find(u) != find(v) - an O(alpha) operation - not a DFS.
  • "Integer weights make Kruskal O(|E|)." Only with radix sort and specific conditions. The generic version is O(|E| log |V|).

How To Use It

Baseline implementation skeleton:

parent[v] := v; rank[v] := 0
find(x):
while parent[x] != x:
parent[x] := parent[parent[x]] # path compression (halving)
x := parent[x]
return x
union(x, y):
rx := find(x); ry := find(y)
if rx = ry: return false
if rank[rx] < rank[ry]: swap rx, ry
parent[ry] := rx
if rank[rx] = rank[ry]: rank[rx] := rank[rx] + 1
return true

Run Kruskal by sorting edges and calling union on each; stop when you have added |V| - 1 edges.

Integration checklist:

  • verify edges are sorted stably (ties do not matter for MST correctness)
  • early-exit after |V| - 1 edges to avoid processing the tail
  • for weighted graphs with floating-point weights, break ties deterministically if you need reproducibility
  • if you want the minimum spanning forest, run Kruskal to completion (will produce multiple components)

Transfer / Where This Shows Up Later

  • S3 (systems). File-system quota and cgroup membership sometimes use union-find for fast "same-group" queries.
  • S4 (compilers). Static single assignment construction uses union-find for variable renaming; type unification uses it to merge equivalence classes.
  • S5 (OS). Page-frame merging for huge pages uses union-find; kernel ID namespaces map via union-find.
  • S6 (databases). Equi-join inference (which columns are guaranteed equal by transitivity) uses union-find.
  • S7 (DDD). Merging bounded contexts at the "shared-kernel" level conceptually does a union on terminology sets.
  • S8 (distributed). Partition detection after network splits: union-find over surviving nodes identifies the quorum-reachable set.

Back-link to concept 13 (cut and cycle properties) since they justify Kruskal's correctness, and forward to concept 16 (MST variants) for single-linkage clustering.

Check Yourself

  1. Why is the running time O(|E| log |V|) rather than O(|E| log |E|) asymptotically (hint: |E| <= |V|^2)?
  2. What does the cut property guarantee about the edges Kruskal adds?
  3. Why does union by rank + path compression yield amortized inverse-Ackermann time?
  4. Why does stopping after |V| - 1 edges give a tree (not a forest or a multigraph)?
  5. Sketch why find(x) with path compression alone still costs O(log |V|) amortized but not O(alpha).
  6. Give a production scenario where you would use union-find without Kruskal.

Mini Drill or Application

Implementation drill.

  1. Implement union-find with union-by-rank and path compression. Test against a naive O(|V|)-per-op implementation on the same sequence of operations.
  2. Implement Kruskal on top of your union-find. Verify correctness by comparing the total weight with a brute-force MST on small graphs (|V| <= 8).
  3. Use your implementation to solve a clustering task: connect each point to its k nearest neighbors, run Kruskal, and stop when you have |V| - c edges to produce c clusters.
  4. Apply Kruskal to a percolation problem: randomly order edges of a grid; run Kruskal until the top and bottom of the grid are in the same component.
  5. Benchmark Kruskal (union-find) vs a naive O(|V|^2) Prim on random sparse graphs; observe Kruskal wins as density decreases.

Read This Only If Stuck