Skip to main content

Digit DP: Counting Numbers With Constraints

What This Concept Is

Digit DP answers questions of the form "how many integers in $[L, R]$ satisfy some digit-wise constraint?" by building candidate numbers digit by digit from the most-significant side, memoizing on a small set of parameters.

The universal reduction: compute $F(N)$ = count of valid integers in $[0, N]$, then answer $[L, R]$ as $F(R) - F(L - 1)$. This turns a two-bounded query into two one-bounded queries.

The standard state has three or four fields:

  • pos -- current digit position (0 is the leading digit).
  • tight -- boolean: is the prefix built so far equal to the prefix of $N$? If so, the next digit is capped at $N[\mathrm{pos}]$; if not, it ranges freely from 0 to 9.
  • leading_zero -- boolean: are we still inside leading zeros (so the number has not really "started")?
  • problem-specific -- sum of digits so far, last digit, residue modulo $k$, whether digit $d$ has appeared, count of a specific digit, parity, etc.

Transitions enumerate the next digit $d \in [0, \mathrm{limit}]$ where $\mathrm{limit} = N[\mathrm{pos}]$ if tight else $9$, update the state, and recurse.

Time complexity: $O(\mathrm{len}(N) \cdot 10 \cdot S)$ where $S$ is the product of sizes of extra state fields. For $N \le 10^{18}$ (19 digits) and sum-of-digits state ($\le 171$ values), total states are $\approx 19 \cdot 10 \cdot 171 \approx 3 \cdot 10^4$ -- tiny.

Why It Matters Here

Digit DP converts "count numbers with property $P$" problems that look combinatorial into compact polynomial DPs. Once you internalize the (pos, tight, ...) state template, most variants become fill-in-the-blanks exercises: just define the extra fields carried along.

It is also the cleanest example of why memoization (not tabulation) is preferred for structurally sparse DPs. The tight = True branch reaches only one specific cell per pos; memoizing it is wrong (the memoized answer depends on $N$). Only the tight = False subproblems are reusable across different tight prefixes. Tabulation would have to ignore most cells anyway.

Concrete Examples

Example 1: count integers in $[0, N]$ whose digit sum is exactly $S$, with $N \le 10^{18}$.

State: $dp[\mathrm{pos}][\mathrm{tight}][\mathrm{sum_so_far}]$. $\mathrm{pos}$ from 0 to $\mathrm{len}(N)$, $\mathrm{tight} \in {0, 1}$, $\mathrm{sum_so_far} \in [0, 9 \cdot \mathrm{len}(N)]$ so at most $9 \cdot 19 \approx 171$.

Recurrence:

def f(pos, tight, sum_so_far):
if pos == len(N):
return 1 if sum_so_far == S else 0
if not tight and memo[pos][sum_so_far] is set:
return memo[pos][sum_so_far]
limit = N[pos] if tight else 9
total = 0
for d in range(0, limit + 1):
total += f(pos + 1, tight and (d == limit), sum_so_far + d)
if not tight:
memo[pos][sum_so_far] = total
return total

Only memoize in the tight = False branch. Time $O(\mathrm{len}(N) \cdot 10 \cdot \max_\mathrm{sum}) = O(\log N \cdot 10 \cdot 171) \approx 3 \cdot 10^4$ states, each with 10 transitions.

Example 2: count integers in $[1, N]$ that contain at least one digit 7. Complement is easier: count numbers with no digit 7, then subtract from $N$.

State: $dp[\mathrm{pos}][\mathrm{tight}][\mathrm{leading_zero}]$. Transition: $d \in {0, 1, 2, 3, 4, 5, 6, 8, 9}$ (skip 7). Subtract result from $N$ to get the answer.

Example 3: count integers in $[L, R]$ divisible by $k$. State: $dp[\mathrm{pos}][\mathrm{tight}][\mathrm{residue}]$. Transition updates $\mathrm{residue} = (10 \cdot \mathrm{residue} + d) \bmod k$. Answer: $f_R(N=R) - f_L(N=L-1)$ where each $f$ counts integers with residue 0.

Example 4: count integers in $[0, N]$ with all distinct digits. State: $dp[\mathrm{pos}][\mathrm{tight}][\mathrm{used_mask}][\mathrm{leading_zero}]$. used_mask has 10 bits, one per digit. Note: without leading_zero, the prefix 001 would mark digit 0 as used twice and drop the count.

Tiny trace: $N=21$, count integers with digit sum exactly 3. Digits of $N$: [2,1]. Iterate pos=0 with tight=True: $d=0$ -> subproblem (pos=1, tight=False, sum=0) counts numbers in 00..09 with sum 3 -> ${3}$ = 1; $d=1$ -> (pos=1, tight=False, sum=1) -> digits 0..9 summing to 2 -> ${2}$ = 1; $d=2$ -> (pos=1, tight=True, sum=2) -> digit 0..1 summing to 3 - 2 = 1 -> ${1}$ = 1. Total: 3 numbers (3, 12, 21). ✓ Verified by brute force.

Common Confusion / Misconceptions

"Memoize the tight = True branch." This is wrong. tight = True depends on the exact digits of $N$ seen so far, which means the "same state" under a different $N$ would not share memo. Only the tight = False branch -- where the remaining suffix is free -- is reusable. Some solutions handle this by not memoizing at all when tight; others handle it by memoizing only when not tight. Either is correct.

"Forget leading_zero." For problems like "count numbers whose digits are all distinct," the prefix 000123 is really the 3-digit number 123; the leading zeros should not count as "digit 0 has been used." Omitting leading_zero is one of the top two digit-DP bugs.

"Digit DP is only about counting." No. Digit DP can compute sums ("sum of digit sums of all numbers in $[L, R]$"), XORs ("XOR of all valid numbers"), or even max/min with a suitable objective. The count is just the most common instance.

"Base 10 only." Digit DP generalizes to any base. Base-2 digit DP counts integers with constraints on their binary representation (e.g., Hamming-weight constraints).

"Two-bounded queries need special handling." They do not, if you remember $F(R) - F(L - 1)$. The only time this fails is when $L = 0$, where $L - 1 = -1$ is undefined; handle by $F(R) - F(L) + [L \text{ is valid}]$ or by using $[0, R] \setminus [0, L - 1]$ framing explicitly.

How To Use It

  1. Reduce $[L, R]$ queries to $F(R) - F(L-1)$ where $F(N)$ counts valid integers in $[0, N]$.
  2. Define state $(\mathrm{pos}, \mathrm{tight}, \mathrm{extras})$ where extras carries everything the constraint depends on.
  3. Write the recurrence enumerating the next digit.
  4. Memoize only the tight = False subproblems.
  5. Handle leading_zero if the constraint is digit-pattern-sensitive (distinct digits, "exactly one 7," etc.).
  6. Verify on small ranges by brute force ($N \le 10^4$) before scaling up.
  7. If you need sums (not counts), carry along a second accumulator alongside the count; transitions combine both.

Build the habit of writing out the state tuple before coding: (pos, tight, leading_zero, sum, residue, used_mask, last_digit). Most problems need only three of these; writing all candidates and crossing out the irrelevant ones prevents carrying bloat and prevents forgetting a necessary field.

Transfer / Where This Shows Up Later

  • S3 Software Design. Configuration validation (counting valid configs under constraints) has the same prefix-extension shape.
  • S5 Networking. IP address / subnet validation and enumeration under policy constraints is a digit-DP-style prefix walk.
  • S6 Databases. Cardinality estimation for range queries ("how many rows satisfy WHERE x BETWEEN L AND R and digit-sum-of-id = 10?") can use digit DP when the column has numeric constraints.
  • S7 Architecture. Counting valid policy sequences under per-step constraints is a digit DP in disguise -- each "digit" is a step, each constraint is a state field.
  • Phase 7 AI. Prefix-beam search in autoregressive language models resembles digit DP: the state is the prefix, transitions are token choices, pruning uses constraint satisfiability.

Check Yourself

  1. Why is $F(R) - F(L-1)$ the standard reduction for digit DP?
  2. Why must tight = True subproblems usually not be memoized?
  3. How would you add a "no two consecutive identical digits" constraint? What extra state field is needed?
  4. For "count numbers in $[0, N]$ with digit sum divisible by $k$," what is the full state tuple?
  5. Why is leading_zero a separate bit from tight?
  6. Generalize the framework to base $b$; what changes and what stays the same?

Mini Drill or Application

  1. Count integers in $[0, 10^6]$ with digit sum exactly $15$. Verify against a brute-force count.
  2. Count integers in $[1, N]$ ($N \le 10^{18}$) that contain at least one digit 7. Hint: count without any 7 and subtract.
  3. Count integers in $[L, R]$ with all-distinct digits ($N \le 10^9$). Use used_mask + leading_zero.
  4. Count integers in $[0, N]$ divisible by $k$. Add residue mod k to the state.
  5. Compute the sum of digit sums of all integers in $[0, N]$. Carry (count, sum) together.
  6. Count integers in $[L, R]$ whose binary representation has exactly $b$ ones (Hamming weight). Base-2 digit DP.
  7. Count integers in $[0, N]$ with no two adjacent equal digits. State needs last_digit and leading_zero.
  8. Count integers in $[0, N]$ whose digit-product is divisible by a target $k$; carry product_mod_k in the state. Be careful with digit 0 (sets product to 0, which is $\equiv 0 \pmod k$).
  9. Given a pattern of allowed digits per position (e.g., a regex like [0-9][13579]), count matches in $[L, R]$. State per position is already encoded in the allowed set; digit DP makes the counting $O(\log N)$.

Read This Only If Stuck

Before consulting external references: write out $F(N)$ for $N=9, 99, 999$ by brute force and compare against your digit DP. If all three match, the state and memoization structure is almost certainly correct. If any disagrees, 80% of the time the bug is in leading_zero, 15% in the memoization condition (tight == False only), and 5% in off-by-one on pos == len(N).