Skip to main content

MST Variants and Applications

What This Concept Is

The MST is the tip of a broader family. Useful variants and close relatives:

  • Maximum spanning tree. Replace min with max (or negate weights) - same algorithms, same proofs.
  • Bottleneck spanning tree. Minimize the maximum edge weight rather than the sum. Any MST is also a minimum-bottleneck spanning tree, so O(|E| log |V|) suffices; linear-time bottleneck trees exist via MBST reductions.
  • Second-best MST. The tree with second-smallest total weight; computable in O(|V||E|) by swapping exactly one MST edge for the minimum-cost substitute on the cycle closed by each non-MST edge.
  • k-th MST. Enumerate MSTs in nondecreasing order of weight using the swap structure (matroid exchange).
  • Steiner tree. Minimum-weight tree that spans a specified subset of "terminal" vertices (others optional). NP-hard; MST of the complete graph on terminals with shortest-path weights gives a 2-approximation (Kou-Markowsky-Berman).
  • Minimum spanning arborescence (Edmonds / Chu-Liu). Directed analogue; chosen edges form a rooted tree oriented away from a root. Polynomial despite directed setting.
  • Euclidean MST. MST of a point set in the plane; computable in O(n log n) via the Delaunay triangulation, which contains the EMST.
  • Degree-constrained MST. Find an MST where no vertex has more than k neighbors. NP-hard for k = 2 (that's Hamiltonian path).
  • Minimum bottleneck path. Not strictly a tree variant but any MST answers all pairwise minimum-bottleneck paths simultaneously along its unique tree paths.

Recognizing that a problem is "MST plus one twist" saves weeks of reinvention.

Why It Matters Here

MST-shaped problems appear throughout systems and optimization:

  • Network design. Cheapest wiring, pipelining, or fiber layout connecting all nodes.
  • Clustering. Remove the k-1 heaviest MST edges to produce k clusters maximizing the minimum inter-cluster gap (single-linkage clustering).
  • Approximation algorithms. MST-based 2-approximations for metric TSP and Steiner tree; 1.5-approximation (Christofides) starts from the MST.
  • Image segmentation. Felzenszwalb-Huttenlocher uses MST-style merging of pixel components via a threshold-on-edge-weight stop rule.
  • Bottleneck and reliability questions. Minimum-bottleneck paths reduce to MSTs; reliable network backbones often use a max-weight MST under reliability scores.
  • Leader election / broadcast trees. Distributed algorithms (GHS) compute MSTs in message-passing systems as overlay backbones.

Concrete Examples

Single-linkage clustering is literally "run Kruskal, stop early."

For k = 2, sort edges: 0.1, 0.2, 0.2, 0.3, 0.4, 2.5. Kruskal adds the first 5 edges and stops before 2.5; the resulting two components ({p1,p2,p3,p4}, {p5,p6,p7}) are the clusters. The excluded edge 2.5 is exactly the largest inter-cluster gap, certified by the cut property.

Metric TSP 2-approximation. Build an MST T of the complete graph on the cities; do a DFS preorder of T to produce a Hamiltonian tour. The tour's length is at most 2 * w(T) <= 2 * OPT, because removing any edge from an optimal tour gives a spanning tree.

MST here is A-B-C, B-D-E with weight 10; DFS preorder A,B,C,B,D,E,D,B,A gives a tour of at most 20.

Common Confusion / Misconceptions

  • "Bottleneck shortest path equals bottleneck spanning tree." A BST minimizes the largest edge in a tree spanning all vertices; a bottleneck shortest path minimizes the largest edge along a single s-t path. Fortunately, any MST already minimizes the bottleneck along the unique tree path between every pair of vertices, so one MST answers both.
  • "MST is greedy, so Steiner tree should be greedy too." Steiner points (non-terminals included for cost savings) break matroid structure. Steiner tree is NP-hard.
  • "Directed MST is just undirected MST on the underlying graph." No. The orientation constraint makes greedy strategies incorrect; Chu-Liu/Edmonds adds a cycle-contraction step.
  • "Maximum spanning tree is NP-hard because maximization is." Only sum-of-edge-weights is maximized; flipping signs reduces to standard MST. Maximization of unrelated tree objectives (diameter, bandwidth) can be NP-hard, but "max total weight" is easy.
  • "Any spanning tree is a valid 2-approximation for TSP." Only the MST gives the 2-approximation bound; an arbitrary spanning tree does not.

How To Use It

Given a problem that smells MST-like, match it to a variant:

  1. Does the problem require spanning all vertices? If not, you may be in Steiner-tree territory.
  2. Does it care about the sum of weights or the max weight? Sum -> MST; max -> bottleneck (still MST-reducible).
  3. Is the graph directed? Use minimum spanning arborescence (Chu-Liu/Edmonds).
  4. Are the points geometric with Euclidean distances? Use Delaunay + MST for O(n log n).
  5. Do you need a family of near-optimal trees? Use k-th MST or matroid swap structure.
  6. Are you picking cluster-count dynamically? Build the full Kruskal trace and post-filter; the MST alone encodes every single-linkage clustering.
  7. Under reliability scores (higher = better)? Max spanning tree via sign flip.

Transfer / Where This Shows Up Later

  • S3 (systems). Data center cabling, rack-level routing, and cluster fabric topologies use MST variants; bottleneck MST drives worst-case latency planning.
  • S4 (compilers). Register-allocation interference graphs and basic-block backbone extraction sometimes use MST-like subroutines.
  • S5 (OS). Page-eviction hierarchies, memory pool merging, NUMA affinity trees.
  • S6 (databases). Replication topology under WAN latency, sharding boundaries minimizing cross-shard joins, reliability-weighted max spanning trees for quorum design.
  • S7 (DDD). Reducing context-map coupling: a minimum-cost spanning subgraph of the context graph picks the cheapest backbone of shared kernels and customer-supplier links.
  • S8 (distributed). Gossip-overlay construction, multicast trees, BGP route-selection (not MST per se but similar greedy shape), STP in Ethernet.

Back-link to concepts 13-15 for the foundational cut/cycle properties and the two standard MST algorithms.

Check Yourself

  1. Explain why every MST is a minimum-bottleneck spanning tree.
  2. Sketch a 2-approximation for metric TSP starting from an MST.
  3. Why is Steiner tree harder than MST despite looking similar?
  4. How would you compute a maximum spanning tree using Kruskal unchanged, apart from a sign flip?
  5. What makes the directed MST (arborescence) problem require a different algorithm from Kruskal?
  6. If you already have an MST and the weight of one non-tree edge decreases, how cheaply can you update the MST?

Mini Drill or Application

Pick two of the variants and implement them:

  1. Second-best MST: compute an MST, then for each non-MST edge e = (u, v), find the maximum-weight edge on the tree path between u and v and consider the swap. The minimum such swap cost gives the second-best MST.
  2. Single-linkage clustering: build a Kruskal implementation that stops when |V| - k edges have been added, and cluster a small 2D dataset (e.g., 2D Gaussian blobs) - visualize results.
  3. Bottleneck s-to-t path: build an MST and walk the unique tree path from s to t to read off the minimum possible maximum edge.
  4. Directed MST (Chu-Liu/Edmonds): implement the cycle-contraction algorithm on a small 6-vertex directed graph.
  5. Euclidean MST via Delaunay: build a Delaunay triangulation of 1000 random 2D points, then run Kruskal on the triangulation edges; verify the result matches an O(n^2) brute-force.

Describe, for each variant, where in the argument the cut property or cycle property shows up.

Read This Only If Stuck