MST Variants and Applications
What This Concept Is
The MST is the tip of a broader family. Useful variants and close relatives:
- Maximum spanning tree. Replace
minwithmax(or negate weights) - same algorithms, same proofs. - Bottleneck spanning tree. Minimize the maximum edge weight rather than the sum. Any MST is also a minimum-bottleneck spanning tree, so
O(|E| log |V|)suffices; linear-time bottleneck trees exist via MBST reductions. - Second-best MST. The tree with second-smallest total weight; computable in
O(|V||E|)by swapping exactly one MST edge for the minimum-cost substitute on the cycle closed by each non-MST edge. - k-th MST. Enumerate MSTs in nondecreasing order of weight using the swap structure (matroid exchange).
- Steiner tree. Minimum-weight tree that spans a specified subset of "terminal" vertices (others optional). NP-hard; MST of the complete graph on terminals with shortest-path weights gives a 2-approximation (Kou-Markowsky-Berman).
- Minimum spanning arborescence (Edmonds / Chu-Liu). Directed analogue; chosen edges form a rooted tree oriented away from a root. Polynomial despite directed setting.
- Euclidean MST. MST of a point set in the plane; computable in
O(n log n)via the Delaunay triangulation, which contains the EMST. - Degree-constrained MST. Find an MST where no vertex has more than
kneighbors. NP-hard fork = 2(that's Hamiltonian path). - Minimum bottleneck path. Not strictly a tree variant but any MST answers all pairwise minimum-bottleneck paths simultaneously along its unique tree paths.
Recognizing that a problem is "MST plus one twist" saves weeks of reinvention.
Why It Matters Here
MST-shaped problems appear throughout systems and optimization:
- Network design. Cheapest wiring, pipelining, or fiber layout connecting all nodes.
- Clustering. Remove the
k-1heaviest MST edges to producekclusters maximizing the minimum inter-cluster gap (single-linkage clustering). - Approximation algorithms. MST-based 2-approximations for metric TSP and Steiner tree; 1.5-approximation (Christofides) starts from the MST.
- Image segmentation. Felzenszwalb-Huttenlocher uses MST-style merging of pixel components via a threshold-on-edge-weight stop rule.
- Bottleneck and reliability questions. Minimum-bottleneck paths reduce to MSTs; reliable network backbones often use a max-weight MST under reliability scores.
- Leader election / broadcast trees. Distributed algorithms (GHS) compute MSTs in message-passing systems as overlay backbones.
Concrete Examples
Single-linkage clustering is literally "run Kruskal, stop early."
For k = 2, sort edges: 0.1, 0.2, 0.2, 0.3, 0.4, 2.5. Kruskal adds the first 5 edges and stops before 2.5; the resulting two components ({p1,p2,p3,p4}, {p5,p6,p7}) are the clusters. The excluded edge 2.5 is exactly the largest inter-cluster gap, certified by the cut property.
Metric TSP 2-approximation. Build an MST T of the complete graph on the cities; do a DFS preorder of T to produce a Hamiltonian tour. The tour's length is at most 2 * w(T) <= 2 * OPT, because removing any edge from an optimal tour gives a spanning tree.
MST here is A-B-C, B-D-E with weight 10; DFS preorder A,B,C,B,D,E,D,B,A gives a tour of at most 20.
Common Confusion / Misconceptions
- "Bottleneck shortest path equals bottleneck spanning tree." A BST minimizes the largest edge in a tree spanning all vertices; a bottleneck shortest path minimizes the largest edge along a single
s-tpath. Fortunately, any MST already minimizes the bottleneck along the unique tree path between every pair of vertices, so one MST answers both. - "MST is greedy, so Steiner tree should be greedy too." Steiner points (non-terminals included for cost savings) break matroid structure. Steiner tree is NP-hard.
- "Directed MST is just undirected MST on the underlying graph." No. The orientation constraint makes greedy strategies incorrect; Chu-Liu/Edmonds adds a cycle-contraction step.
- "Maximum spanning tree is NP-hard because maximization is." Only sum-of-edge-weights is maximized; flipping signs reduces to standard MST. Maximization of unrelated tree objectives (diameter, bandwidth) can be NP-hard, but "max total weight" is easy.
- "Any spanning tree is a valid 2-approximation for TSP." Only the MST gives the 2-approximation bound; an arbitrary spanning tree does not.
How To Use It
Given a problem that smells MST-like, match it to a variant:
- Does the problem require spanning all vertices? If not, you may be in Steiner-tree territory.
- Does it care about the sum of weights or the max weight? Sum -> MST; max -> bottleneck (still MST-reducible).
- Is the graph directed? Use minimum spanning arborescence (Chu-Liu/Edmonds).
- Are the points geometric with Euclidean distances? Use Delaunay + MST for
O(n log n). - Do you need a family of near-optimal trees? Use k-th MST or matroid swap structure.
- Are you picking cluster-count dynamically? Build the full Kruskal trace and post-filter; the MST alone encodes every single-linkage clustering.
- Under reliability scores (higher = better)? Max spanning tree via sign flip.
Transfer / Where This Shows Up Later
- S3 (systems). Data center cabling, rack-level routing, and cluster fabric topologies use MST variants; bottleneck MST drives worst-case latency planning.
- S4 (compilers). Register-allocation interference graphs and basic-block backbone extraction sometimes use MST-like subroutines.
- S5 (OS). Page-eviction hierarchies, memory pool merging, NUMA affinity trees.
- S6 (databases). Replication topology under WAN latency, sharding boundaries minimizing cross-shard joins, reliability-weighted max spanning trees for quorum design.
- S7 (DDD). Reducing context-map coupling: a minimum-cost spanning subgraph of the context graph picks the cheapest backbone of shared kernels and customer-supplier links.
- S8 (distributed). Gossip-overlay construction, multicast trees, BGP route-selection (not MST per se but similar greedy shape), STP in Ethernet.
Back-link to concepts 13-15 for the foundational cut/cycle properties and the two standard MST algorithms.
Check Yourself
- Explain why every MST is a minimum-bottleneck spanning tree.
- Sketch a 2-approximation for metric TSP starting from an MST.
- Why is Steiner tree harder than MST despite looking similar?
- How would you compute a maximum spanning tree using Kruskal unchanged, apart from a sign flip?
- What makes the directed MST (arborescence) problem require a different algorithm from Kruskal?
- If you already have an MST and the weight of one non-tree edge decreases, how cheaply can you update the MST?
Mini Drill or Application
Pick two of the variants and implement them:
- Second-best MST: compute an MST, then for each non-MST edge
e = (u, v), find the maximum-weight edge on the tree path betweenuandvand consider the swap. The minimum such swap cost gives the second-best MST. - Single-linkage clustering: build a Kruskal implementation that stops when
|V| - kedges have been added, and cluster a small 2D dataset (e.g., 2D Gaussian blobs) - visualize results. - Bottleneck
s-to-tpath: build an MST and walk the unique tree path fromstotto read off the minimum possible maximum edge. - Directed MST (Chu-Liu/Edmonds): implement the cycle-contraction algorithm on a small 6-vertex directed graph.
- Euclidean MST via Delaunay: build a Delaunay triangulation of 1000 random 2D points, then run Kruskal on the triangulation edges; verify the result matches an
O(n^2)brute-force.
Describe, for each variant, where in the argument the cut property or cycle property shows up.
Read This Only If Stuck
- ADM 6.1.4: Variations on MST
- ADM 6.1.2: Kruskal's Algorithm
- ADM: War Story - Nothing But Nets
- ADM: Chapter Notes on MSTs
- CLRS 21.1: Growing an MST
- CLRS 21.2: Kruskal and Prim
- Competitive Programming 4.3.4: MST Applications
- Sedgewick: Elementary Graph Algorithms (MST context)
- cp-algorithms: Second-best MST
- Sedgewick & Wayne, algs4 booksite: MSTs and clustering