State Compression and When It Is Worth It
What This Concept Is
State compression is the practice of packing a complex piece of auxiliary state into a single integer (or a small tuple) so the DP table is indexable by a simple key. The two most common compressions:
- Subset-as-bitmask for small $n$ (used in bitmask DP): encode ${$elements seen / used / visited$}$ as one integer.
- Small-configuration-as-int for profile DP (a.k.a. "broken profile" or "broomstick" problems), e.g., tiling an $m \times n$ grid where $m \le 16$. Each row's compatibility pattern is encoded as one integer.
A broader family includes:
- Digit DP states (prior concept) -- the
(tight, leading_zero, extras)tuple is a compressed state. - Hash-based compression for arbitrary small-universe states.
- Run-length encoding of repeated elements when the recurrence only reads the current run.
State compression is worth it when:
- the hidden state truly is small (e.g., at most $n$ bits, $n \le 20$);
- without it, the alternative is exponential time, not just exponential state;
- the compressed state enables fast transitions (bit tricks,
popcount, submask iteration).
It is not worth it when:
- the hidden state grows with input size in an unbounded way;
- the natural problem structure is continuous or large-integer;
- you could get a polynomial DP with a different (non-compressed) state design.
Why It Matters Here
The decision "should I use a bitmask?" is the gatekeeper for a large family of problems. Picking it when the universe is small and transitions are clean turns an intractable search into a tractable DP. Picking it when $n = 30$ and $2^n = 10^9$ wastes your day.
Profile DP -- state-compression on a 2D grid -- extends the idea to row-by-row transitions on a grid where the cross-row coupling is bounded. It is the right tool for tiling, coloring, and permanent computations on narrow grids.
This concept also provides the vocabulary to recognize compression opportunities in later modules, where a seemingly continuous / large state sometimes compresses once you notice the true degrees of freedom are few.
Concrete Examples
Example 1: Profile DP -- tile an $m \times n$ grid with $1 \times 2$ dominoes ($m \le 16$).
Process the grid column by column. For each column $j$, the state is the set of cells in column $j$ that are already "half-filled" by a horizontal domino reaching from column $j - 1$. That set is a subset of $m$ cells -- one bit per row. State: $dp[j][\mathrm{mask}]$ = number of tilings of the first $j$ columns such that column $j$'s "already-filled" pattern is $\mathrm{mask}$.
Transition: enumerate the next column's placement pattern and check compatibility with $\mathrm{mask}$. Complexity $O(n \cdot 2^m \cdot \mathrm{transitions})$, fast enough for $m = 12$, borderline for $m = 16$.
For a $2 \times n$ grid (i.e., $m = 2$), $dp[n]$ = number of tilings = Fibonacci. For $4 \times n$, the count follows a well-known linear recurrence derivable by enumerating profile transitions.
Example 2: Submask iteration. Sometimes you want to iterate over all subsets of a mask $S$:
sub = S
while sub > 0:
# process sub
sub = (sub - 1) & S
# also process sub = 0 if needed
Total work over all $S \subseteq {0..n-1}$ is $O(3^n)$, not $O(4^n)$: the pair $(S, \mathrm{sub})$ enumerates "each element is either not-in-$S$, in-$S$-not-in-$\mathrm{sub}$, or in-$\mathrm{sub}$" -- three choices per element, giving $3^n$ total pairs.
Proof identity: $\sum_{S \subseteq [n]} 2^{|S|} = 3^n$. Each of $n$ elements contributes a factor of $3$ (not in $S$, in $S$ but not in submask, or in submask).
Example 3: SOS DP (Sum over Subsets). Compute, for each mask $S$, the sum $\sum_{T \subseteq S} f(T)$ in $O(n \cdot 2^n)$ instead of the naive $O(3^n)$. Uses a per-bit sweep where each bit's influence on all masks is added in one pass.
for i in range(n):
for mask in range(1 << n):
if mask & (1 << i):
dp[mask] += dp[mask ^ (1 << i)]
After this, dp[mask] contains the sum over all subsets of mask.
Common Confusion / Misconceptions
"My $n$ is 25, bitmask should still work." $2^{25} \approx 3 \cdot 10^7$, and most bitmask DPs have an inner $O(n)$ or $O(2^n)$ loop, pushing you to $10^9$ or $10^{15}$ ops. Rules of thumb:
- $n \le 20$ for $n \cdot 2^n$ transitions
- $n \le 16$ for $n^2 \cdot 2^n$ or $3^n$ submask work
- $n \le 22$ with careful constant factor and SOS DP
"State compression is lossless." A compressed state is exactly equivalent iff the remaining free variables of the original state are irrelevant to future transitions. If you notice you need "which specific element is in slot 3" and not just "the set of elements seen so far," the subset-as-bitmask compression loses information. Rework the state.
"Any small-universe state can be a bitmask." Compression needs a finite small universe. A problem whose "state" is a list of arbitrary items, a tree shape, or a range of continuous values does not naturally fit. Attempting to compress anyway produces brittle code that fails on edge inputs.
"Submask iteration is always $O(2^n \cdot 2^n) = O(4^n)$." No -- summed over all outer masks, it is $O(3^n)$. The identity above is the correct budget.
"SOS DP is a different topic." It is state compression's best companion: when you want to sum or combine information across all subsets, SOS DP does in $O(n \cdot 2^n)$ what naive submask iteration would do in $O(3^n)$. For $n = 20$: $2 \cdot 10^7$ vs $3.5 \cdot 10^9$. Big difference.
How To Use It
Decide in three steps:
- Is the hidden state a subset / small configuration? If it is a list, tree shape, or continuous value, the answer is usually no.
- Is $n$ small enough? Budget roughly $10^7$-$10^8$ total operations; check $n \cdot 2^n$ or $3^n$ against that.
- Can transitions be done with bit tricks? If you find yourself doing $O(n)$ or $O(2^n)$ work per transition, reconsider -- you may need submask iteration ($O(3^n)$) or SOS DP ($O(n \cdot 2^n)$).
If all three are yes, compress. Otherwise, find a different state design.
Additional checks:
- Write down what information the compressed state drops. Confirm future transitions do not need it.
- For profile DP, confirm the profile width (usually $m$) is bounded and small.
- Validate on a small case via brute force before scaling up.
Transfer / Where This Shows Up Later
- S3 Software Design. Bitfields in protocol and binary-format parsers, permissions, capability flags -- all compression patterns with the same "set-in-an-integer" discipline.
- S5 Networking. TCP/IP flag fields, BGP attribute bitmaps, feature-negotiation masks all exploit state compression for efficient representation and comparison.
- S6 Databases. Bitmap indexes compress "set of rows matching predicate $P$" into bitmasks; intersect/union via AND/OR is the compression's payoff.
- S7 Architecture. "Architecture feature matrix" as a bitmask over capability columns lets you express compatibility checks as bit operations on ADR metadata.
- Phase 7 AI. Tabular reinforcement learning on factored MDPs compresses factor combinations into integers; Boolean network synthesis uses SOS-DP-style sweeps for inclusion-exclusion counting.
Check Yourself
- Why is $n = 20$ a common ceiling for bitmask DP?
- Why does submask iteration sum to $O(3^n)$ across all $S$?
- Give one example where compression is tempting but wrong (the "natural" subset is not small enough).
- What does SOS DP compute, and what is its time complexity?
- Prove the identity $\sum_{S \subseteq [n]} 2^{|S|} = 3^n$.
- When does profile DP help on an $m \times n$ grid, and what is the ceiling on $m$?
Mini Drill or Application
- Count perfect tilings of a $4 \times n$ grid with $1 \times 2$ dominoes via profile DP. Verify against the known recurrence.
- Given an array of $n \le 18$ numbers, count the number of ways to partition it into two subsets with equal sum, using subset DP.
- For $n = 16$, generate all subsets of the full mask and use submask iteration to compute, for each subset, the sum of its elements. Compare runtime with the naive $O(n \cdot 2^n)$ approach.
- Explain in writing why a problem with $n = 40$ and a "visit every node" constraint is probably not a bitmask DP problem. Identify an alternative (approximation, heuristic, problem-specific structure).
- Implement SOS DP on an array $f$ of length $2^{18}$ and verify against brute-force submask summation for $n = 14$.
- Profile DP on a $6 \times n$ grid with forbidden cells -- adapt the dominoes recurrence to skip masks that cover blocked cells.
Read This Only If Stuck
- Competitive Programming: 8.3.2 Compilation of Common DP Parameters
- Competitive Programming: 8.3 More Advanced DP Techniques
- Competitive Programming: 8.3.4 MLE: Consider Using Balanced BST as Memo Table
- Skiena: 8.7 Limitations of Dynamic Programming (TSP)
- Competitive Programming: 3.5.3 Non-classical Examples
- cp-algorithms: Enumerating submasks of a bitmask
- cp-algorithms: DP on Broken Profile
- USACO Guide: Sum over Subsets DP