Choosing the Right Paradigm for Optimization Problems

What This Concept Is

When you face a new optimization problem, the right first question is not "what algorithm?" but "what paradigm?" The major candidates:

Brute force / backtracking. Correct, slow. Use for tiny inputs, as a reference, or to generate ground-truth for testing smarter algorithms.
Greedy. Fastest when it works; requires an exchange argument or other proof of correctness. Use for problems with the greedy-choice property (activity selection, MST, Huffman, Dijkstra, fractional knapsack).
Divide and conquer. Use when subproblems do not overlap (merge sort, quickselect, Strassen, FFT). DP is for when they do overlap.
Dynamic programming. Use when the problem has optimal substructure and overlapping subproblems. Memoization vs tabulation is a secondary decision.
Graph algorithms (Dijkstra, Bellman-Ford, BFS, topological DP, max-flow, matching). Use when the problem naturally lives on a graph, especially with cycles or weighted transitions.
Approximation / heuristic. Use when the exact problem is NP-hard and a provably-close solution is acceptable (set cover, metric TSP has a 3/2-approx via Christofides, bin packing has an FFD approximation).
Integer programming / LP. Use when constraints are linear and the problem does not fit a known combinatorial structure. LP solvers are surprisingly effective up to millions of variables; ILP is harder but tractable for many real instances.
Randomization / sampling. Monte Carlo, Las Vegas, and sketching algorithms are the right tool when exact methods are too slow and "answer within $\varepsilon$" is acceptable.

The discipline is to choose deliberately, with reasons, rather than pattern-match on a similar-looking problem.

Why It Matters Here

Most "I can't solve this DP" failures are not DP failures. They are paradigm failures: the problem was greedy, or it was shortest-path-on-a-graph, or it was NP-hard and no polynomial algorithm exists. Choosing the wrong paradigm wastes days. Choosing the right one shrinks the problem to a known template -- often to something you finished in a prior module.

This concept is the capstone of the module. Every earlier concept was a tool in the box; this one is how you pick the right tool without opening all of them.

Concrete Examples

Example 1: "minimum number of coins to make change for amount $A$."

Brute force: exponential.
Greedy (take largest denomination each time): wrong for $\mathrm{denoms} = {1, 3, 4}, A = 6$ (returns 3, optimal is 2).
DP over $A$: $O(A \cdot |\mathrm{denoms}|)$ -- works, assuming $A$ is not astronomical.
Graph reformulation: vertices are $0..A$, edges $i \to i + c$ for each coin $c$. BFS gives minimum hops; equivalent to the DP.

Same problem, multiple valid lenses. DP and BFS-on-DAG are equivalent here. Choose based on whether you also need the coin sequence (BFS gives it naturally via parent).

Example 2: "minimum spanning tree."

DP: does not naturally apply; no obvious "subproblem decomposition by prefix."
Greedy (Kruskal, Prim): exchange-argument-provable correctness via the cut property, $O(E \log V)$.

Right paradigm: greedy with a union-find.

Example 3: "shortest path with negative edges but no negative cycle."

Greedy (Dijkstra): incorrect on negative edges (counterexample: triangle with edges $a \to b: 2$, $a \to c: 5$, $b \to c: -10$; Dijkstra finalizes $c$ with $5$ before seeing the shortcut via $b$).
DP: Bellman-Ford is literally a DP over the number of edges used, $O(VE)$.
Right paradigm: DP / graph algorithm.

Example 4: "maximum subset sum not exceeding $S$" with $n \le 40$."

Bitmask DP: $2^{40}$ is too big directly.
Meet-in-the-middle: split into halves, enumerate subsets of each ($2^{20}$ each), binary-search across. $O(2^{n/2} \log 2^{n/2})$. Right paradigm: divide-and-conquer + binary search, not pure DP.

Example 5: "TSP on $n = 50$ cities."

Held-Karp DP: $2^{50}$ is intractable.
Christofides: 3/2-approximation for metric TSP in polynomial time.
Concorde: branch-and-cut solves instances up to 85,000 cities exactly in practice.
Right paradigm: approximation or specialized exact solver, not DP.

Common Confusion / Misconceptions

"Use DP when in doubt." A very common trap -- using DP when greedy would suffice, producing a correct but slower solution. The reverse is worse: using greedy when DP is required and producing a confidently wrong solution that passes your spot-check.

"Ignore NP-hardness because the inputs are small." TSP, subset sum with large numbers, longest simple path, graph coloring, vertex cover, and many scheduling problems have no known polynomial algorithm. If the input size is small, bitmask DP or branch-and-bound works; if it is large, you must move to approximation or heuristic. Skipping the NP-hardness check is how you spend a week trying to find a polynomial algorithm for vertex cover.

"Pseudo-polynomial = polynomial." Knapsack, subset sum, and coin change are pseudo-polynomial -- polynomial in the value of the input, not its bit length. For $W = 10^{18}$, $O(nW)$ is exponential in the input size. Move to approximation (FPTAS for knapsack) or reformulate.

"Pure paradigms only." Many real algorithms mix paradigms. Dijkstra is greedy; Bellman-Ford is DP; A* mixes greedy best-first with heuristic estimates. Branch-and-bound combines brute force with DP-style lower bounds. Knowing paradigm boundaries lets you recognize the mix.

"Every optimization problem has a known paradigm." Some are genuinely hard: dense subgraph, graph isomorphism, certain scheduling problems sit in complexity classes with no clean polynomial algorithm and no standard approximation. Acknowledging this early is better than forcing a paradigm fit.

How To Use It

Use this decision process on any new optimization problem:

Bound input size. What is the largest $n$? Answer dictates "polynomial? pseudo-polynomial? small exponential?"
Write brute force first in pseudocode. Often you can clarify the structure and get a test oracle.
Check substructure. Does an optimal solution decompose into optimal sub-solutions?
Check overlap. Does the recursion revisit subproblems? If yes -> DP. If no -> divide and conquer.
Check greedy choice. Can you prove an exchange argument? If yes -> greedy.
Check graph structure. Does the problem live on an explicit or implicit graph? Is that graph a DAG, weighted, with/without negative edges?
Check NP-hardness. If suspicious, try to reduce from a known NP-hard problem (3-SAT, Vertex Cover, Subset Sum). If hard, switch to approximation, exact exponential algorithms, or integer programming.
Check if a mix is needed. Meet-in-the-middle? Branch-and-bound? DP inside a search?

Transfer / Where This Shows Up Later

S3 Software Design. The Strategy pattern literalizes paradigm choice as a runtime decision. Testing at least two strategies against a reference oracle defends against silent greedy bugs.
S5 Networking. Routing protocols are paradigm-split: OSPF uses Dijkstra (greedy), BGP uses policy-driven heuristics, MPLS uses DP-flavored label switching. Choosing between them is the exact concept on this page applied to a networking problem.
S6 Databases. Query optimizers pick plan-enumeration strategies: DP (Selinger System R) for small $n$, genetic / randomized search for large $n$. The choice rule is exactly this concept.
S7 Architecture. Architectural decision-making uses this framework explicitly: "is this decision like TSP (hard, commit and approximate) or like MST (easy, prove correct greedy)?" ADRs should name the paradigm they assumed.
Phase 7 AI. Model selection, hyperparameter search, feature selection, and neural architecture search all face this decision: exact DP when small, Bayesian optimization / evolutionary search when large, random search as a baseline.

Check Yourself

State three problems where greedy is correct and three where DP is required.
What does it mean for a problem to be "pseudo-polynomial," and how does that affect paradigm choice?
If you suspect a problem is NP-hard, what is the next step before you give up on exact algorithms?
When would you prefer meet-in-the-middle over bitmask DP?
Name a problem where multiple paradigms give the same answer but differ in runtime or reconstruction convenience.
Give a situation where LP/ILP is the right tool and DP is not.

Mini Drill or Application

For each problem, choose the paradigm and justify in a sentence. Then implement your chosen approach and verify on small cases.

Given job durations and deadlines, minimize total tardiness.
Given a set of intervals on a line, find the minimum number of points that stab every interval.
Given a DAG with edge weights, compute the shortest path from $s$ to every vertex.
Given an array of $n = 100$ integers, decide whether there is a subset summing to target $T$ (with $T \le 10^5$).
Given a weighted undirected graph, find the minimum total weight of edges that keep the graph connected.
Given a string $s$ of length $10^5$, find the longest palindromic subsequence.
TSP on $n = 15$ cities -- choose between Held-Karp DP and brute force; justify.
Maximum matching in a bipartite graph -- choose between DP, network flow, and Hungarian algorithm; justify.

Name the paradigm before writing code. Explicitly rule out the alternatives in one sentence each.

What This Concept Is​

Why It Matters Here​

Concrete Examples​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​