Reconstructing the Solution From a DP Table
What This Concept Is
A DP computes the value of an optimum, for example the minimum edit distance, the longest LIS length, or the maximum knapsack value. But in real applications you usually need the object that achieves that value -- the actual edit script, the actual subsequence, the actual items chosen, the actual parenthesization.
Reconstruction is the process of reading back an optimal object from the filled table. The standard techniques:
- Parent pointers (eager). Store, along with each
dp[state], the choice that achieved it -- which branch of themax/minwon, or which predecessor was used. Walk back from the final state, following choices. - Retrace by comparison (lazy). Scan the filled table at the end and, at each cell, ask which predecessor could have produced this value by inverting the recurrence. The first one that matches is a valid previous step.
- Path recovery from auxiliary arrays. Problems like LIS maintain a separate
prev[i]array alongside the value DP; walkprevbackward from the argmax.
Either technique gives you a single optimal object. Counting all optimal objects is a different (and usually harder) computation: you replace the max with a sum over branches that tie. Finding the lexicographically smallest optimal object adds tie-break rules at each branch.
Reconstruction is orthogonal to memoization vs tabulation. Both styles support both techniques; only the mechanics change.
Why It Matters Here
A DP that returns only the optimal value is often not enough. An edit-distance routine that returns 7 but cannot say "insert, match, delete, substitute, match, match, delete" is useless for diff tools. A knapsack that returns maximum value but not the list of items is useless for a packer. A matrix chain that returns 4500 scalar multiplications but cannot produce (A1 (A2 A3)) is useless as a compiler pass.
Reconstruction is part of being done, not an optional extra. It is also where rolling-array space reductions collide with product requirements: the fastest version often loses the information you need, forcing you to keep a choice log.
Concrete Examples
0/1 knapsack reconstruction. Let $dp[i][w]$ be the best value using the first $i$ items with capacity $w$. Recurrence:
$$dp[i][w] = \begin{cases} \max\big( dp[i-1][w],\ dp[i-1][w - w_i] + v_i \big), & w_i \le w \ dp[i-1][w], & w_i > w \end{cases}$$
After filling the table, walk back:
items = []
w = W
for i from n down to 1:
if dp[i][w] != dp[i-1][w]:
items.append(i) # item i was taken
w -= weight[i]
return reversed(items)
The test $dp[i][w] \ne dp[i-1][w]$ asks "would the answer have been worse without item $i$?" If yes, $i$ must have been taken. Decrement capacity and continue. This is lazy reconstruction -- no extra bookkeeping.
LIS reconstruction (basic $O(n^2)$, eager). Maintain $\mathrm{prev}[i]$ = the index that $i$ extends. When you update $L[i]$, also set $\mathrm{prev}[i] = j$. Find $i_{\text{end}} = \arg\max L$, then walk $\mathrm{prev}$ backward to the start to recover the subsequence in reverse.
Edit distance reconstruction (eager with choice). At each $(i, j)$, store $c[i][j] \in {\uparrow, \leftarrow, \nwarrow}$ for which branch of the three-way min won (delete, insert, match/substitute). Walk from $(m, n)$ to $(0, 0)$ emitting operations; reverse.
Matrix chain reconstruction. Alongside $dp[i][j]$, store $s[i][j]$ = the split point $k$ that minimized. Recursive routine to print:
def print_parens(i, j):
if i == j: return "A" + str(i)
k = s[i][j]
return "(" + print_parens(i, k) + " * " + print_parens(k+1, j) + ")"
Worked knapsack walk-back. Items $(w,v) = (2,3),(3,4),(4,5),(5,6)$, $W=5$. Full table $dp[i][w]$ (rows $i=0..4$, columns $w=0..5$):
| 0 | 1 | 2 | 3 | 4 | 5 | |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 3 | 3 | 3 | 3 |
| 2 | 0 | 0 | 3 | 4 | 4 | 7 |
| 3 | 0 | 0 | 3 | 4 | 5 | 7 |
| 4 | 0 | 0 | 3 | 4 | 5 | 7 |
Walk back from $(4,5)$: $dp[4][5]=dp[3][5]$, so item 4 skipped; $dp[3][5]=dp[2][5]$, so item 3 skipped; $dp[2][5]\ne dp[1][5]$ (7 vs 3), so item 2 taken, $w=5-3=2$; $dp[1][2]\ne dp[0][2]$, so item 1 taken, $w=0$. Reconstructed items: ${1,2}$ with value $3+4=7$. ✓
Common Confusion / Misconceptions
"Reconstruction after the fact is always fine." It is fine for small tables but becomes brittle when the recurrence has many cases or ties. The robust habit is to record the choice at the same time you compute the value. A choice[state] array that stores "which branch of the max/min won" turns reconstruction into a straight walk.
"Ties do not matter." They do when the downstream code expects a specific tie-break (e.g., lexicographically smallest edit script). If you do not decide a tie-break, your reconstruction will pick one arbitrarily -- consistent per-implementation, but not reproducible across implementations.
"Rolling arrays still support reconstruction." Usually no. Rolling collapses the information you need to reconstruct. Either keep the full table, or log the choices into an auxiliary structure ($O(n)$ extra per row may be fine even when rolling the values themselves).
"One optimum is enough." For ambiguity-sensitive applications (diff visualization, alignments, phylogeny) you may need all optimal objects or a random one. Counting optima is a separate DP; sampling uniformly among them is another.
How To Use It
At recurrence-writing time:
- For each
max/minover branches, record which branch was chosen (as an index, tag, or enum). - For base cases, record a sentinel marker ("this is the start") so the walk-back terminates cleanly.
- At the end, start from the final state and walk the choice array back to the base, collecting choices.
- Reverse the collected choices to present them in forward order.
- If space is tight, consider Hirschberg-style divide-and-conquer reconstruction, which recovers the alignment in $O(mn)$ time and $O(\min(m, n))$ space for edit distance / LCS.
- Decide the tie-break rule before you ship -- not after a user reports inconsistency.
- Test reconstruction with an "apply the returned object and recompute" oracle: the reconstructed edit script applied to
smust producet; the reconstructed item list must fit in the knapsack and sum to the reported value.
Transfer / Where This Shows Up Later
- S3 Software Design. Reconstruction is the Memento pattern: the DP's intermediate table is a memento of the optimization trajectory. Decorators wrap "compute value" and "reconstruct object" as separable concerns.
- S5 Networking. Dijkstra's
prevmap is identical reconstruction machinery: you compute shortest-path distances, butprevlets you emit the path. - S6 Databases. A Selinger plan's final output is itself a reconstructed tree from the dynamic-programming table of sub-join plans.
- S7 Architecture. ADRs often need to trace why an optimum was chosen -- reconstruction is the analogy: the value is the decision, the trace is the argumentation.
- Phase 7 AI. Reinforcement learning's policy is reconstruction of the Bellman value function: $\pi^(s) = \arg\max_a [R(s,a) + \gamma V^(s')]$ -- the same pattern.
Check Yourself
- Why is the value alone insufficient for edit distance in practice?
- How would you modify a tabulated LIS to report the subsequence, not just its length?
- If two branches tie, does it matter which you record? When might it?
- Why does rolling-array space reduction usually forbid naive reconstruction?
- How does Hirschberg's trick recover a full alignment in $O(\min(m, n))$ space?
- What is the difference between "a random optimum" and "the lexicographically smallest optimum"?
Mini Drill or Application
Take each of the following and add reconstruction:
- Knapsack 0/1: return both maximum value and the list of chosen items.
- Edit distance: return the minimum cost and the sequence of operations (
insert,delete,substitute,match). - Rod cutting: return the optimal revenue and the list of piece lengths.
- Matrix chain: return both the minimum scalar count and a parenthesization string.
- Longest common subsequence: return both the length and one valid LCS string.
For each, verify with at least one hand-computed small example that (a) the returned value is optimal and (b) the reconstructed object actually achieves that value. Add a property-based test that fuzzes random inputs and checks this invariant.
Tie-break harness. After you have one reconstruction, write a second that always picks the lexicographically smallest choice at each tie point (e.g., in edit distance, prefer delete < insert < substitute). Run both on 20 random inputs. You should see identical costs but sometimes different scripts. This exercise cements the mental model that reconstruction is a choice policy layered on a value DP, not an afterthought.
Read This Only If Stuck
- Skiena: 8.2.3 Reconstructing the Path
- CLRS: 14.4 Longest Common Subsequence (Part 3)
- CLRS: 14.5 Optimal Binary Search Trees (Part 4)
- CLRS: 14.1 Rod cutting (Part 3 -- parent-array reconstruction)
- Competitive Programming: 3.5.1 DP Illustration
- cp-algorithms: Knapsack (reconstruction pattern)
- Competitive Programmer's Handbook (Laaksonen) -- free PDF, DP chapters