Graph Problem Recognition
What This Concept Is
Most real problems do not arrive labeled as graph problems. Turning a verbal or procedural problem into a graph question is an explicit modeling step:
- decide what counts as one vertex
- decide what kind of relation becomes an edge, and in which direction
- decide whether the relation carries a quantity (weight, capacity, cost)
- translate the target question into a standard graph problem: reachability, shortest path, spanning, flow, matching, coloring, cycle detection, and so on
- verify the reduction preserves the answer
Recognition is a skill, not a theorem. The more graph-shaped questions you have already seen, the faster you spot them. CLRS calls the canonical set "the seven standard problems": reachability, shortest path, minimum spanning tree, max flow, matching, topological order, and cycle detection. Almost every exam problem reduces to one of these.
Why It Matters Here
Skiena's line - "design graphs, not algorithms" - is the controlling idea of this module. Once the graph is right, the textbook algorithms handle the rest.
Concrete payoffs:
- A scheduling problem with precedence constraints becomes "topological order on a DAG."
- A project-selection problem becomes "min-cut on a carefully built flow network."
- An image-segmentation problem becomes "min-cut on a grid graph with data terms."
- A task-to-machine assignment becomes "bipartite matching."
- "Arbitrage detection" across currencies becomes "negative cycle detection with log-transformed weights."
- "Fewest keystrokes between two words" becomes "shortest path on a grid of edit states."
The algorithm does not change; the problem's clothes do.
Concrete Example
You are given a list of words like ["wrt", "wrf", "er", "ett", "rftt"] that appear in a foreign dictionary in sorted order. Find an ordering of the foreign alphabet consistent with the list.
Direct attempt: search all permutations of the letters used - exponential.
Graph-first attempt: for each adjacent pair of words, find the first differing character (a, b) and record "a must come before b" as a directed edge a -> b. You now have a directed graph whose vertices are the letters used.
- If the graph has a cycle, no consistent order exists.
- If the graph is a DAG, any topological order of it is a valid alphabet. For the edges above, one legal order is
w, e, r, t, f.
The problem is not intrinsically graph-shaped, but once you see the precedence edges, it becomes standard.
Second worked example - word ladder: given BEGIN and END and a dictionary, change one letter at a time through valid words. Model: vertex = valid word, edge = differs by exactly one letter (undirected, unit weight). Target = shortest path. Algorithm = BFS. The "word-ladder" problem is just BFS in disguise.
Common Confusion / Misconceptions
- "Shortest path, so Dijkstra." Dijkstra needs nonnegative weights, a single source, and additive path cost. If any of those is missing, the algorithm is wrong even if it compiles. (Negative edges -> Bellman-Ford; bottleneck cost -> MST walk; APSP -> Floyd-Warshall.)
- "Everything is a graph problem." Forcing a graph onto a problem that is really array-DP or interval-scheduling wastes effort. If there is no natural pairwise relation, stop modeling.
- "Vertices must be the obvious objects." Often the right vertex is a state (a knight's position, a game configuration, a pair
(city, fuel-left)) not a raw input item. Expanding the state space is a standard trick. - "Once modeled, the edges write themselves." You still need to sanity-check direction, weight, multi-edges, and self-loops before declaring the graph fixed.
How To Use It
Use this recognition checklist before coding:
- Can I describe two entities that stand in a relation? (vertex candidates, edge candidates)
- Does direction matter in that relation? (directed vs undirected)
- Is there a quantity attached to the relation? (weights, capacities)
- What is the target question in graph vocabulary? (reach, shortest, spanning, cut, matching, cycle, color)
- Is this a standard problem, or a reduction to one?
- What structural shortcut applies (DAG, bipartite, planar, tree, unit weights)?
- Can I write a one-sentence specification of
(V, E, weights, target)and commit to it before touching code?
If step 5 fails, you may be trying to model a non-graph problem.
Transfer / Where This Shows Up Later
Graph-problem recognition is the single most transferable skill in the semester.
- S3 (systems). Recognizing that linker symbol resolution is reachability + topological order turns a fuzzy "why does my build break?" into a concrete graph query.
- S4 (compilers). Dead-code elimination is reachability from entry; loop detection is back-edge detection in the CFG; register allocation is graph coloring on the interference graph.
- S5 (OS). Deadlock detection is cycle detection in the wait-for graph; CPU scheduling uses DAG-longest-path for critical paths.
- S6 (databases). Query optimization is shortest-path-like search over a DAG of plan alternatives; foreign-key cascades are reachability.
- S7 (architecture, DDD). Context-mapping exercises are graph modeling in disguise; fitness functions then check the modeled graph.
- S8 (distributed). Consistent-hashing rebalance is edge-swap on a graph; leader election is MST-like; gossip protocols simulate BFS.
Back-link to S1 M2 where relations on sets first introduce edges, and S2 M2 where DP over arrays is contrasted with DP over DAG vertices.
Check Yourself
- Why is "design the graph first" usually better than "pick the algorithm first"?
- Give a problem that looks non-graph but reduces to shortest paths.
- Give a problem that looks graph-like but is better solved with DP or greedy on arrays.
- What makes a problem "really a flow problem" rather than a shortest-path problem?
- Give a problem where the natural vertex set is states, not the input objects themselves.
Mini Drill or Application
Translate each into a graph problem (vertices, edges, direction, weights, target):
- Given a set of compile steps with dependencies, produce a legal build order.
- Given currency-pair exchange rates, decide whether risk-free arbitrage exists.
- Given a set of movies and actors, decide whether a given actor connects to Kevin Bacon in at most
khops. - Given a grid of land and water cells, count the number of islands.
- Given a list of people and conflicting time slots, assign time slots so no conflict exists.
- Given a two-player game where each state is a board, decide who wins with perfect play.
- Given a word list and a pair of words, find the shortest ladder between them.
For each, write one sentence stating which standard algorithm then applies.
Portfolio drill. Keep a running "graph problems I have solved" journal. Each entry gets: problem statement (3 sentences), graph model, standard algorithm used, one trap you hit. After 20 entries, the recognition pattern is internalized.
Read This Only If Stuck
- ADM 6.6: Design Graphs, Not Algorithms
- ADM 5 (ch intro): Graph Traversal overview
- ADM 5.4: War Story - Getting the Graph
- ADM 5.4 (Part 2): War Story - Getting the Graph
- ADM 6.4: War Story - Dialing for Documents
- Competitive Programming 4.1: Overview and motivation
- Competitive Programming 4.7: Special graphs
- Competitive Programmer's Handbook (cses.fi) - graph chapters
- Sedgewick & Wayne, algs4 booksite: Graphs