Divide-and-Conquer: Split, Solve, Combine
What This Concept Is
Divide-and-conquer solves a problem of size $n$ by:
- Divide. Split the input into $a$ (nearly) independent subproblems, each of size roughly $n/b$.
- Conquer. Solve each subproblem recursively. A base case handles small $n$ directly.
- Combine. Merge the subproblem answers into the full answer in $f(n)$ time.
The resulting cost satisfies $T(n) = a \cdot T(n/b) + f(n)$, solved with the master theorem or a recursion tree (concept 3). The base case's cost is usually absorbed into the $f(n)$ term; when the base case fires far more times than you expect, revisit the split.
D&C is not the same as "any recursion." Three properties distinguish it from plain recursion or from dynamic programming:
- Balanced split (the subproblems are roughly the same size): this is what unlocks the logarithmic depth.
- Independent subproblems (no shared state or overlap): otherwise DP is the right tool, not D&C.
- Cheap combine relative to subproblem work: if combine is expensive, D&C does not beat a direct loop.
When all three hold, D&C gives both a clean correctness proof (strong induction on size, concept 6) and a clean complexity story (a single recurrence, concept 3).
Why It Matters Here
Mergesort, quicksort, binary search, closest-pair, Strassen's matrix multiplication, Karatsuba multiplication, FFT, order statistics (median-of-medians), most range-query structures, and many parallel algorithms are divide-and-conquer. The pattern is the most reliable way to beat $\Theta(n^2)$ on array problems when the operation has a clean combine step.
It is also the paradigm that maps most directly onto parallelism: independent recursive calls run in parallel, and the recurrence often rewrites to $T_\infty(n) = T_\infty(n/b) + \text{span}(f)$ for critical-path length. In S5-S8 this becomes the skeleton for fork-join schedulers, MapReduce, and tree-aggregation in distributed systems.
A smaller but practical benefit: D&C algorithms expose cache-oblivious structure. By halving the input until it fits in cache, you get near-optimal memory behavior without knowing the cache size - the recursion does the tuning for you.
Concrete Example
Example 1 (mergesort).
def merge_sort(A):
if len(A) <= 1:
return A
mid = len(A) // 2
left = merge_sort(A[:mid])
right = merge_sort(A[mid:])
return merge(left, right)
def merge(L, R):
out, i, j = [], 0, 0
while i < len(L) and j < len(R):
if L[i] <= R[j]:
out.append(L[i]); i += 1
else:
out.append(R[j]); j += 1
out.extend(L[i:]); out.extend(R[j:])
return out
- Divide: split the list in half, $O(1)$ indexing in real implementations.
- Conquer: sort each half recursively (IH gives sorted outputs, concept 6).
- Combine:
mergewalks both halves in parallel, producing a sorted result in $\Theta(n)$.
Recurrence: $T(n) = 2 T(n/2) + n$. Master theorem case 2: $T(n) = \Theta(n \log n)$.
Example 2 (maximum subarray, D&C). Split $A$ at the midpoint. The max-subarray either lives entirely in the left half, entirely in the right half, or crosses the midpoint. The first two are recursive. The crossing case: scan from midpoint leftward for the max suffix, rightward for the max prefix, sum them. Combine is $\Theta(n)$, recurrence $T(n) = 2 T(n/2) + n = \Theta(n \log n)$.
Kadane's algorithm in $\Theta(n)$ is faster in practice, but the D&C version parallelizes perfectly: each half runs on a different thread. This is the first problem where D&C's parallel story becomes visible.
Example 3 (counting inversions). Count pairs $(i, j)$ with $i < j$ and $A[i] > A[j]$. Modify mergesort: during merge, whenever you pick from the right half while the left still has elements, all remaining left elements form an inversion with the chosen right element. Total cost $\Theta(n \log n)$. Naive brute force is $\Theta(n^2)$.
Example 4 (Strassen's matrix multiplication). Split $n \times n$ matrices into four $n/2 \times n/2$ blocks, use 7 multiplications (not 8) at each level. Recurrence $T(n) = 7 T(n/2) + \Theta(n^2)$, master theorem case 1, $T(n) = \Theta(n^{\log_2 7}) \approx \Theta(n^{2.807})$ - the first sub-cubic matrix multiplication.
Common Confusion / Misconceptions
- "Dividing alone is D&C." Selection sort recursively finds the minimum and then sorts the rest. That "divides" the problem but yields $T(n) = T(n-1) + n = \Theta(n^2)$ because the split is unbalanced. The balanced split plus a linear combine is what unlocks the log factor.
- "D&C always beats iteration." Not for problems without a meaningful combine step. Summing a list via D&C is still $\Theta(n)$, slower in practice than a simple loop (larger constants, worse cache behavior). D&C earns its keep when the combine is strictly cheaper than solving from scratch.
- "Unequal splits are fine." Often they are not. $T(n) = T(n/10) + T(9n/10) + n$ still solves to $\Theta(n \log n)$ (Akra-Bazzi), but $T(n) = T(n-1) + n = \Theta(n^2)$ looks almost the same and is quadratic. The relative sizes matter.
- "Subproblems don't have to be independent." They do, for D&C. If subproblems overlap, you want DP (concept 12). The independence is what lets you solve each without consulting the other.
- "Recursive = slow." Only with pessimistic implementations. A well-written mergesort is competitive with any $O(n \log n)$ sort; the recursion overhead is dwarfed by the work.
- "D&C needs exactly two subproblems." No. $k$-way merge sort, FFT (two halves for radix-2, more for mixed-radix), and medians-of-medians (five groups) are all D&C. The master theorem handles any constant fan-out $a$.
How To Use It
For a new problem, ask:
- Can I split the input so the parts are independent, or nearly so?
- Is the split balanced? An $n/b$-plus-$n/b$ split is ideal; $n-1$-plus-$1$ is a red flag.
- If I had answers for the parts, could I combine them in less work than solving the whole thing from scratch?
- What recurrence does that give, and what does the master theorem say (concept 3)?
- Base case. What is the smallest $n$ for which I can return a direct answer?
- Correctness proof. Strong induction on size; see concept 6.
- Parallelize? If subproblems are independent, the two recursive calls run in parallel with critical path $T_\infty(n) = T_\infty(n/b) + \text{span}(f)$.
If the combine step is cheap (say $O(1)$ or $O(\log n)$), you often get $\Theta(n)$ or $\Theta(\log n)$ overall. If the combine is $\Theta(n)$, you typically land at $\Theta(n \log n)$. If the combine is $\Theta(n^c)$ with $c > \log_b a$, the combine dominates and you land at $\Theta(f(n))$ - master theorem case 3.
Transfer / Where This Shows Up Later
- S1 M5 (problem solving) framed "reduce to smaller instances." D&C is the structured reduction.
- S4 (systems) uses D&C for cache-oblivious algorithms: the recursive split naturally fits the memory hierarchy, and the recurrence $T(n) = 2 T(n/2) + n/B$ gives optimal block transfers.
- S5 (OS) schedules fork-join tasks whose structure is exactly a D&C recursion tree; work-stealing schedulers achieve optimal load balance on such trees.
- S6 (databases) uses D&C for external-memory merges (k-way merge is a multi-way D&C combine) and for parallel aggregation on sharded data.
- S7 (architecture) expresses latency fitness functions as D&C trees: "fan out to $k$ shards, each returns in $O(\log n)$, combine in $O(k)$ equals a request-response tree with bounded tail."
- S8 (distributed systems) runs MapReduce, tree aggregation, and spanning-tree broadcast as literal D&C recurrences over a network instead of over an array.
Concept 3 is the complexity tool for this page; concept 6 is the correctness tool; cluster 4 handles the case where subproblems overlap and D&C degrades.
Within this semester: S2 M2 builds on D&C for mergesort and quicksort; S2 M3 uses D&C-flavored recursion for tree traversals, Tarjan's SCC, and articulation-point detection; S2 M5 uses D&C for segment-tree builds ($T(n) = 2T(n/2) + O(1)$, hence $\Theta(n)$ build time).
Check Yourself
- Why does mergesort solve to $\Theta(n \log n)$ and not $\Theta(n^2)$, given that merging is $\Theta(n)$?
- Give an example where a D&C approach matches brute force (no improvement) because the combine step is too expensive.
- What is the recurrence for Strassen's matrix multiplication and why is it faster than the $\Theta(n^3)$ triple loop?
- Under what conditions does D&C parallelize with work $\Theta(n \log n)$ and span $\Theta(\log^2 n)$?
- Why is "quickselect" D&C but "quicksort" on one half only isn't - and what does that tell you about the recurrence?
- Show that the 2D closest-pair D&C algorithm of CLRS §33.4 satisfies $T(n) = 2T(n/2) + O(n)$, ignoring the "strip check." Why is the strip check's cost $O(n)$ and not $O(n^2)$?
- Parallel mergesort has work $\Theta(n \log n)$ and span $\Theta(\log^2 n)$. Why does merging two sorted arrays in parallel contribute a $\log n$ factor to the span?
Mini Drill or Application
Design a D&C algorithm, state the recurrence, and solve it.
- Find the maximum element of an array.
- Count the number of inversions in an array.
- Find the closest pair of points in a 1D array (given sorted).
- Given a sorted rotated array, find the pivot index.
- Multiply two $n$-digit integers in $o(n^2)$ using Karatsuba: $T(n) = 3 T(n/2) + n$.
- Implementation drill. Implement
merge_sort(A)+merge(L, R)in pseudocode. Derive the exact number of comparisons as a function of $n$, show that the recurrence $C(n) = 2 C(n/2) + (n - 1)$ gives $C(n) = n \log_2 n - n + 1$ for $n$ a power of 2, and confirm the $\Theta(n \log n)$ shape numerically.
Read This Only If Stuck
- Skiena: Mergesort (sorting by divide-and-conquer)
- Skiena: Divide and conquer
- Skiena: Solving divide-and-conquer recurrences
- CLRS: Designing algorithms (D&C template)
- CLRS: Strassen's algorithm for matrix multiplication
- Competitive Programming: Divide and conquer
- Grokking: Quicksort
- MIT 6.046J Spring 2015 - Divide and conquer lectures (Demaine; includes parallel D&C)
- CP Algorithms: Divide and conquer - closest pair (worked 2D closest-pair implementation)
- MIT 6.046J Spring 2015 - Divide & conquer II (FFT) (FFT as a D&C algorithm)
- Jeff Erickson, Algorithms - Divide and conquer chapter (free PDF; alternate exposition with careful recurrence derivations)