Longest Increasing Subsequence: O(n^2) and O(n log n) Patience

What This Concept Is

Given a sequence $a[1..n]$, the longest increasing subsequence (LIS) is a maximum-length subsequence whose values are strictly increasing. A subsequence is obtained by deleting zero or more elements while preserving order; it does not need to be contiguous.

Two standard algorithms:

$O(n^2)$ DP (classic tabulation). Let $L[i]$ = length of the longest increasing subsequence that ends at index $i$. Then

$$L[i] = 1 + \max\big{ L[j] ,:, j < i \text{ and } a[j] < a[i] \big},$$

with $L[i] = 1$ if no such $j$ exists. Answer is $\max_i L[i]$.

$O(n \log n)$ patience sorting (smarter state). Maintain an array $\mathrm{tails}$, where $\mathrm{tails}[k]$ is the smallest possible tail value of an increasing subsequence of length $k+1$ seen so far. For each $a[i]$, binary-search $\mathrm{tails}$ for the first value $\ge a[i]$ and replace it (or append if none). Length of $\mathrm{tails}$ at the end is the LIS length.

The $O(n \log n)$ version is not obviously a DP. It is: each update to $\mathrm{tails}$ corresponds to extending a specific DP state, and the binary search is finding the right state to update. The patience-sorting analogy (Aldous-Diaconis) is one way to make this vivid: deal cards left-to-right, place each onto the leftmost pile whose top is greater, and the number of piles at the end equals the LIS length.

Why It Matters Here

LIS is the canonical "same problem, two different state designs" example. The $O(n^2)$ formulation exposes the straightforward substructure: the subproblem is "LIS ending at $i$." The $O(n \log n)$ one shows that, with a smarter state (tails indexed by length, not position), the same DP becomes much faster. This pattern -- rethinking the state -- is the highest-leverage DP skill.

LIS also appears as a subroutine in higher problems: longest chain of pairs, envelope nesting, box stacking, and certain scheduling problems all reduce to LIS after a sort. Recognizing the reduction is the real win.

Concrete Examples

Input: $a = [2, 4, 3, 5, 1, 7, 6, 9, 8]$.

$O(n^2)$ trace.

i	1	2	3	4	5	6	7	8	9
$a[i]$	2	4	3	5	1	7	6	9	8
$L[i]$	1	2	2	3	1	4	4	5	5

LIS length = $5$, achieved e.g. by $2, 3, 5, 6, 8$ or $2, 3, 5, 7, 9$ or $2, 3, 5, 7, 8$.

$O(n \log n)$ trace. $\mathrm{tails}$ after each step:

$[2] \to [2,4] \to [2,3] \to [2,3,5] \to [1,3,5] \to [1,3,5,7] \to [1,3,5,6] \to [1,3,5,6,9] \to [1,3,5,6,8]$.

Final length $= 5$. Note that $\mathrm{tails}$ at the end is not a valid LIS -- it is a length-indexed summary. The last entry 8 was placed after the index of 9 had produced the previous tails, so the actual LIS cannot be $[1,3,5,6,8]$ in the original order (check: 1 appears at index 5, after 3,5,6 were seen). Reconstruction requires storing the index in the original array that produced each tail, plus a prev pointer.

Reduction example: Box Stacking. You are given $n$ boxes each with width $w_i$, depth $d_i$, height $h_i$. A box $A$ can be stacked on $B$ only if $w_A < w_B$ and $d_A < d_B$ (strict both dimensions). Find the maximum stack height. Reduction: sort boxes by $w$ ascending, breaking ties by $d$ descending (so no two boxes with equal $w$ are both picked), then run weighted LIS on $d$ where the weight of each box is its height $h_i$ and the objective is max total weight over a strictly-increasing subsequence. This is $O(n^2)$ weighted LIS -- same skeleton, different objective.

Memoized $O(n^2)$ formulation. The same DP can be written top-down:

from functools import lru_cache
@lru_cache(None)
def L(i):                          # LIS length ending at i
    best = 1
    for j in range(i):
        if a[j] < a[i]:
            best = max(best, L(j) + 1)
    return best
answer = max(L(i) for i in range(n))

Top-down is easier to write from the recurrence; bottom-up is easier to space-optimize and avoids recursion depth. Both produce the same $L[\cdot]$ array.

Common Confusion / Misconceptions

"Use the tails array as the LIS." It is not. tails records the best possible tail for subsequences of each length, but those tails may come from incompatible subsequences that could never be concatenated. To reconstruct an actual LIS, store, for each position $i$, the length of the LIS ending at $i$ and the index of the previous element in that LIS, and walk back from the position that achieved the maximum length.

"Strictly increasing vs non-decreasing is a stylistic choice." The difference is one operator and completely changes the answer. For strictly increasing, binary-search for first $\ge a[i]$ (lower_bound in C++, bisect_left in Python). For non-decreasing, binary-search for first $> a[i]$ (upper_bound, bisect_right). Get this wrong and the test suite may pass by luck.

"The $O(n \log n)$ trick only works for LIS." It generalizes to the "Longest X-respecting chain" family: sort by one dimension, LIS on the other. Envelope nesting, Russian-doll boxes, longest chain of pairs are all instances.

"LIS and LCS are the same." They are related but distinct. LIS of $a$ equals LCS of $a$ and $\mathrm{sort}(a)$ when $a$ has distinct elements; with ties the equivalence needs care. Do not mix the recurrences.

"Patience sorting is a different algorithm than DP." No -- it is the same DP, just with a state redefined as "best tail per length" instead of "best length per position." The binary search selects the correct predecessor length in $O(\log n)$ rather than scanning all $j < i$ in $O(n)$. This is the cleanest example in the module of the principle: rethinking state changes complexity, not correctness.

How To Use It

For $O(n^2)$ LIS:

Initialize $L[i] = 1$ and $\mathrm{prev}[i] = -1$ for all $i$.
For each $i$, look at all $j < i$; update $L[i] = L[j] + 1, \ \mathrm{prev}[i] = j$ whenever $a[j] < a[i]$ and this improves $L[i]$.
Return $\max_i L[i]$. Reconstruct by walking $\mathrm{prev}$ from $\arg\max$.

For $O(n \log n)$ LIS:

Initialize $\mathrm{tails} = []$, $\mathrm{tails_index} = []$ (index into $a$ producing each tail), $\mathrm{prev}[i]$ (index in $a$ of LIS predecessor of $i$).
For each $a[i]$, binary-search $\mathrm{tails}$ for the first $\ge a[i]$.
If found at position $k$: replace $\mathrm{tails}[k] = a[i]$, $\mathrm{tails_index}[k] = i$; set $\mathrm{prev}[i] = \mathrm{tails_index}[k-1]$ (if $k > 0$).
If not found: append $a[i]$ and set $\mathrm{prev}[i] = \mathrm{tails_index}[-1]$ (if non-empty).
Return $|\mathrm{tails}|$. Reconstruct by walking $\mathrm{prev}$ from $\mathrm{tails_index}[-1]$.

Transfer / Where This Shows Up Later

S3 Software Design. LIS is a recurring example for the Strategy pattern: same interface, two implementations (naive vs patience). Writing both lets you swap by problem size.
S5 Networking. Packet reordering detection sometimes uses LIS-like structure on sequence numbers to quantify "how out of order did this stream become?"
S6 Databases. Query optimizers occasionally compute an LIS-style longest monotonic run to decide whether a scan order is compatible with an index.
S7 Architecture. "What is the longest chain of ADRs that remain consistent under the current constraints?" is an LIS-over-decisions framing.
Phase 7 AI. LIS appears in sequence modeling as a baseline for measuring "monotonic attention coverage"; also as a feature engineering step in anomaly detection of near-sorted streams.

Check Yourself

Why does tails in the $O(n \log n)$ version not contain an actual LIS?
What one-character change in the binary search switches from strictly increasing to non-decreasing?
How would you count the number of distinct LIS, not just its length?
Prove that tails is always non-decreasing (so binary search is valid).
Reduce "longest chain of pairs $(a_i, b_i)$ with $b_i < a_{i+1}$" to LIS.
What is the LIS length of a random permutation of $[1..n]$? (Hint: Erdős-Szekeres bounds.)

Mini Drill or Application

Implement both versions and verify they agree on 20 random arrays of length 100.
Modify the $O(n^2)$ version to return one valid LIS (not just its length).
Modify the $O(n \log n)$ version to return one valid LIS (hint: store (tails_index, prev_index) per position).
Solve "maximum-length chain of pairs" (each pair $(a, b)$ with $b > a$; chain $(a_1, b_1), (a_2, b_2), \ldots$ requires $b_k < a_{k+1}$) by reducing to LIS.
Solve "Russian Doll Envelopes": sort by width ascending, by height descending within equal width, then LIS on heights.
Count the number of LIS of length equal to the LIS length (tie-count variant).
"Maximum sum increasing subsequence" -- replace the "+1" with "+a[i]" and track max sum, not max length.
Given $a$ and $b$, find the length of the longest common increasing subsequence. Combine LIS and LCS states into $dp[i][j]$ where $i$ indexes $a$ and $j$ indexes $b$.

Read This Only If Stuck

If the $O(n\log n)$ version will not click: write out the full $\mathrm{tails}$ evolution for a tiny input (5 numbers). Then, for each position $i$, ask "at the moment $a[i]$ was inserted, which entry in $\mathrm{tails}$ did it replace, and what does that replacement mean?" The answer -- "my new tail of length $k$ is better (smaller) than the previous best tail of length $k$" -- is the state redefinition, spelled out.

What This Concept Is​

Why It Matters Here​

Concrete Examples​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​