Skip to main content

Greedy Needs an Exchange Argument

What This Concept Is

A greedy algorithm builds a solution by making the locally best-looking choice at each step and never reconsidering. The entire structure is a sequence of commitments; there is no rollback, no search, no reconsideration.

Greedy works when the problem has two properties (CLRS §15.2):

  1. Greedy-choice property. There exists a locally optimal choice that is part of some globally optimal solution.
  2. Optimal substructure. After making that choice, the remaining subproblem is the same kind of problem, and its optimal solution combined with the greedy choice is optimal overall.

Proving a greedy algorithm correct usually uses an exchange argument: take any optimal solution that disagrees with the greedy choice, swap the greedy choice into it, and show the resulting solution is still optimal (no worse, still feasible). Hence there is always an optimal solution that agrees with greedy at the first step; recurse.

A deeper framework, matroid theory (CLRS §15.5 in older editions / §16.4 onward), characterizes exactly which problems admit greedy solutions: those whose feasible sets form a matroid admit the standard "pick the best, check feasibility" greedy. For this module, the exchange argument is the operational tool; matroids are the background explanation.

Why It Matters Here

Greedy algorithms are fast (often $O(n \log n)$ after sorting) and short, but they are rarely obvious. Interval scheduling, Huffman coding, Kruskal's and Prim's MST algorithms, Dijkstra's shortest path, fractional knapsack, and optimal merge trees are greedy. Integer knapsack, longest common subsequence, and weighted interval scheduling are not. Without a correctness proof, it is easy to invent a plausible greedy rule that fails on an adversarial input - a silent bug.

Recognizing when to abandon greedy is as important as applying it. In the wild, greedy proposals are made constantly ("sort by ratio," "pick the smallest," "process in order"), and teams burn days debugging what was a categorical mistake from minute one.

Concrete Example

Example 1 (activity selection). Given $n$ activities with start times $s_i$ and finish times $f_i$, pick the largest set of mutually non-overlapping activities.

Greedy rule: repeatedly pick the activity with the earliest finish time among those compatible with what is already chosen.

Exchange proof sketch.

  • Let $a_1$ be greedy's first pick (earliest finish time overall).
  • Let $OPT$ be any optimal solution sorted by finish time; let $b_1$ be its first activity.
  • If $b_1 = a_1$, done. Otherwise $f(a_1) \le f(b_1)$ (greedy picked the earliest), so replacing $b_1$ by $a_1$ in $OPT$ still yields a valid, same-size solution: $a_1$ finishes no later than $b_1$, so any activity compatible with $b_1$ (starting after $f(b_1)$) is compatible with $a_1$ too.
  • Recurse on the remaining subproblem (activities starting after $f(a_1)$), which is the same kind of problem (optimal substructure).

Contrast: "pick the activity with the shortest duration" fails on $[0, 10]$, $[9, 11]$, $[10, 20]$: shortest-duration greedy picks $[9, 11]$ (length 2) and gets 1; earliest-finish greedy picks $[0, 10]$ and $[10, 20]$ and gets 2. This is exactly the kind of adversarial input that the exchange argument forces you to consider.

Example 2 (Huffman coding). Given character frequencies, build a prefix-free binary code of minimum expected length. Greedy: repeatedly merge the two lowest-frequency nodes into a new parent node; use a min-heap.

Exchange argument. In any optimal tree $T$, let $x, y$ be the two lowest-frequency characters. Swap them with the two deepest siblings in $T$ - the total cost does not increase because lower-frequency items moved deeper offsets higher-frequency items moving shallower. So there is an optimal tree where $x, y$ are siblings at maximum depth. Collapse them into a single symbol $z$ with frequency $f(x) + f(y)$ and recurse. Runtime $\Theta(n \log n)$ with a priority queue.

Example 3 (Kruskal's MST). Sort edges by weight; add each to the spanning forest if it doesn't form a cycle. Exchange: in any spanning tree not containing the current minimum weight edge $e$, adding $e$ creates a cycle with some heavier edge $e'$; swapping $e'$ for $e$ decreases total weight. Hence any MST contains a minimum weight edge; recurse. Runtime dominated by sort: $\Theta(m \log m)$.

Example 4 (fractional knapsack - works). Sort by $v_i / w_i$ descending, take as much capacity permits. Integer knapsack - fails; see concept 14.

Example 5 (Dijkstra's shortest path). Greedy rule: repeatedly settle the unsettled vertex with the smallest tentative distance; relax its outgoing edges.

Exchange argument (sketch). Suppose the first vertex $u$ settled with a distance $d[u]$ that is not the true shortest-path distance. Let $P$ be the true shortest $s \to u$ path and let $y$ be the first vertex on $P$ still unsettled. Then $d[y] \le d[u]$ (since $y$ comes earlier on a path no longer than the one implied by $d[u]$), contradicting greedy's choice of $u$. $\square$

Runtime with a binary heap: $O((V + E) \log V)$; with a Fibonacci heap (amortized $O(1)$ DECREASE-KEY): $O(E + V \log V)$. This is the classical place where amortized analysis (concept 4) changes an asymptotic class.

Example 6 (interval coloring - greedy works). Given intervals, assign each a color so overlapping intervals receive different colors, minimizing the number of colors. Greedy: sort intervals by start time; when processing an interval, assign the lowest-indexed color not used by any currently-active interval. Correctness: the number of colors used equals the maximum depth of overlap, which is a lower bound on any valid coloring - so greedy is optimal. Runtime $O(n \log n)$.

Common Confusion / Misconceptions

  • "Greedy means take the biggest/smallest thing." No, greedy means commit to a local choice that is provably compatible with some optimal solution. Different problems require different greedy rules (earliest finish, shortest distance, smallest weight, highest ratio). Guessing a rule without a correctness argument is how learners produce wrong answers on contests and in review.
  • "If a greedy works on the sample, it works." Counterexamples for a bad greedy usually require a little adversarial construction. Always check one case where the greedy's choice is tempting but provably suboptimal; if you cannot construct one, try harder before trusting the algorithm.
  • "Greedy is weaker than DP." Different tools. When greedy applies, it runs in $O(n \log n)$ and uses $O(n)$ memory; DP might need $O(n W)$ or $O(n^2)$ for the same problem shape. When greedy doesn't apply, DP is correct and greedy is wrong - cost versus correctness, not "weaker."
  • "Matroids are academic." Matroids are why we have Kruskal's algorithm, Prim's algorithm, and scheduling algorithms based on deadlines. Knowing the matroid abstraction tells you in advance that a greedy will work without an exchange proof per problem.
  • "Greedy never backtracks, so it never fails gracefully." Right - when greedy commits to a bad choice, the solution is wrong, not slow. That is why correctness proof is a prerequisite, not an epilogue.

How To Use It

To design a greedy algorithm:

  1. Propose a local rule (earliest finish, largest ratio, smallest weight, highest priority).
  2. State the exchange argument: given any optimal solution disagreeing with greedy, swap in the greedy choice without loss.
  3. Argue optimal substructure: the remaining subproblem is the same kind.
  4. Implement and test against a brute-force oracle for small inputs (concept 15).
  5. If the exchange does not go through, identify why: is there a trade-off that a single local choice cannot see? That trade-off is typically the DP state.
  6. If the problem fits a known matroid (edges of a graph, linearly independent vectors, schedulable jobs with deadlines), greedy works automatically.
  7. Sort cost is usually the dominant term: $O(n \log n)$ sort plus $O(n)$ greedy scan = $O(n \log n)$ total.

If you cannot write the exchange argument, your greedy is probably wrong. Falling back to DP (concept 14) is the standard escape.

Transfer / Where This Shows Up Later

  • S1 M5 (problem solving) introduced "commit, measure, iterate." Greedy is the algorithmic version: commit to the local rule, never revise.
  • S4 (systems) uses greedy scheduling for register allocation (heuristic graph coloring) and greedy cache replacement (LRU is greedy in "evict oldest unused"); greedy is optimal for offline caching under CLRS §15.4 analysis.
  • S5 (OS) runs greedy rate-monotonic scheduling and greedy deadline scheduling (EDF is provably optimal for preemptive single-processor scheduling - a matroid-style result).
  • S6 (databases) uses greedy join-order heuristics in query optimizers (left-deep tree heuristic) and greedy buffer-pool eviction.
  • S7 (architecture) analyzes architectural decisions by greedy: when the exchange argument goes through (the local choice cannot be improved without hurting a constraint), commit; otherwise, revisit with the full trade-off matrix.
  • S8 (distributed systems) uses greedy leader-election heuristics and greedy load-shedding at the edge.

Concept 14 (greedy vs DP) contrasts this page with the DP paradigm; concept 18 tells you how to decide which to use in practice.

Check Yourself

  1. State the exchange argument for activity selection in your own words.
  2. Give a problem where the "obvious" greedy rule is wrong; show the adversarial input.
  3. Why does fractional knapsack admit a greedy but $0/1$ knapsack does not?
  4. What is a matroid, informally, and why do matroid problems always admit greedy solutions?
  5. In Huffman coding, why do we pick the two lowest-frequency nodes, not a single lowest-frequency node?

Mini Drill or Application

For each problem, propose a greedy rule and either prove it optimal (exchange argument) or produce a counterexample.

  1. Coin change with coins ${1, 5, 10, 25}$ to make amount $T$.
  2. Coin change with coins ${1, 3, 4}$ to make amount $6$.
  3. Minimize the sum of completion times on a single machine (jobs have durations only).
  4. Interval coloring: assign each interval a color so no two overlapping intervals share a color, using the minimum number of colors.
  5. Minimum number of platforms needed at a train station given arrival/departure times.
  6. Implementation drill. Implement activity_select(activities) that returns a maximum-size set of non-overlapping activities. Derive complexity: sort by finish time $\Theta(n \log n)$, one pass $\Theta(n)$, total $\Theta(n \log n)$. Prove correctness by the exchange argument above and stress-test against a brute-force oracle for $n \le 12$.

Read This Only If Stuck