Quicksort and Why Randomization Helps

What This Concept Is

Quicksort sorts by choosing a pivot, partitioning the array so that elements (\leq) pivot sit on the left and (>) pivot on the right, and recursing on each side.

def quicksort(A, lo, hi):
    if lo >= hi:
        return
    p = partition(A, lo, hi)
    quicksort(A, lo, p - 1)
    quicksort(A, p + 1, hi)

Expected running time with a random (or randomized) pivot is (\Theta(n \log n)). Worst case is (\Theta(n^2)) when the pivot is consistently extreme (for example, always the minimum or maximum). It sorts in place with (\Theta(\log n)) expected stack depth and (\Theta(n)) worst-case stack depth -- which is why introsort caps recursion.

There are three partition schemes worth distinguishing:

Lomuto. One-pointer scan; simple to code, more swaps, poor on duplicates.
Hoare. Two-pointer scan from both ends; fewer swaps on average, subtle off-by-one handling, the standard for performance-tuned code.
3-way (Dutch National Flag / Bentley-McIlroy). Splits into < pivot, = pivot, > pivot; turns (\Theta(n^2)) into (\Theta(n \log n)) on inputs with many duplicate keys.

Quicksort is not stable by default (partition reorders equal elements) and has expected comparison count (\sim 2n \ln n \approx 1.39,n \log_2 n) -- about 39% more than mergesort's average, but with smaller per-comparison constants because it touches memory sequentially.

Why It Matters Here

Quicksort is the practical default sort in most languages' standard libraries (C qsort, C++ std::sort with introsort fallback, Rust slice::sort_unstable is pattern-defeating quicksort). Python's list.sort uses Timsort, but internal sorts often use quicksort-shaped logic. Three ideas make it worth studying directly:

partition as a reusable primitive (used in quickselect, Dutch National Flag, 3-way partition, nth_element, find-the-median-of-medians)
the role of randomization in converting "bad input" into "bad luck"
how pivot choice -- first, last, median-of-three, random, ninther -- changes expectations without changing the algorithm

If you only know mergesort, you do not yet understand why practical systems accept a worst case they avoid in probability.

Concrete Examples

Example 1 -- Lomuto partition around last element. Partition [3, 7, 1, 4, 5, 2, 6] around pivot 4. Rotate 4 to the end: [3, 7, 1, 5, 2, 6, 4]. Walk with i tracking "end of <= pivot region":

<= 4 ; swap A[i], A[j]; i++      -> [3, 7, 1, 5, 2, 6, 4]
> 4  ; do nothing
<= 4 ; swap                      -> [3, 1, 7, 5, 2, 6, 4]
> 4  ; do nothing
<= 4 ; swap                      -> [3, 1, 2, 5, 7, 6, 4]
> 4  ; do nothing
final swap A[i], pivot             -> [3, 1, 2, 4, 7, 6, 5]

Pivot 4 now sits at index 3, smaller on the left, larger on the right. Recurse into [3, 1, 2] and [7, 6, 5].

Example 2 -- expected comparison count via indicator variables. Let (X_{ij}) be the indicator that the (i)-th smallest and (j)-th smallest are ever compared. In randomized quicksort they are compared iff one of them is the first pivot chosen from the range ([i, j]), which happens with probability (\tfrac{2}{j - i + 1}). So

[ \mathbb{E}[\text{comparisons}] = \sum_{i < j} \frac{2}{j - i + 1} = 2 \sum_{k=2}^{n} \frac{n - k + 1}{k} = 2n H_n - O(n) \sim 2 n \ln n. ]

This is the cleanest probability-meets-algorithms argument in the CLRS canon, and it reuses exactly the linearity-of-expectation machinery you will apply to hash tables in concept 11.

Common Confusion / Misconceptions

"Randomized quicksort has worst case (n \log n)." No. The worst case is still (\Theta(n^2)). What changes is that no adversary can force that worst case by choosing a bad input; the worst case now requires bad coin flips, whose probability decays as roughly (e^{-\Omega(n)}).

"Median-of-three removes the need for randomization." Median-of-three helps on real data but still has (\Theta(n^2)) adversarial inputs -- McIlroy's "killer adversary" construction demonstrates this against the classic Bentley-McIlroy qsort. Only true randomization removes the adversary. For libraries that cannot randomize (reproducibility requirements), introsort's depth limit is the safety net.

"Quicksort sorts equal keys gracefully." Naive Lomuto partition on an array of all-equal keys is (\Theta(n^2)) because every element goes to one side. Use 3-way partitioning when duplicates are common -- it is what Java's Dual-Pivot Quicksort essentially does.

"Recursion depth is (\log n)." Expected depth is (\log n); worst-case depth is (n), which blows the stack on deep arrays if recursion is unbounded. Introsort switches to heapsort once depth exceeds (2 \log_2 n) to cap this.

"Randomized quicksort and randomized pivot are the same thing." Two subtly different knobs. Random pivot picks a uniform index inside [lo, hi] for each partition. Shuffle-then-deterministic-quicksort shuffles the array once up front, then partitions with a fixed rule. They give the same expected asymptotics, but the shuffle variant reads the input twice and the per-partition variant reads it once -- the second matters when the comparator is expensive or the array is streamed.

How To Use It

Use quicksort when:

you care about average-case speed, not worst-case guarantees
you need in-place sorting with low memory
the data is large enough that cache friendliness matters (quicksort beats mergesort on locality)
you can tolerate non-stability, or you sort opaque values with no tie-breaking semantics

When implementing or tuning quicksort:

randomize the pivot or shuffle the input first -- never trust real data to be random
switch to insertion sort on small subarrays (n < 16)
use 3-way partitioning for inputs with many duplicate keys
cap recursion depth (introsort) to guarantee (\Theta(n \log n)) worst case
recurse into the smaller partition and iterate over the larger -- keeps stack depth (\leq \log_2 n)

For worst-case guarantees, use introsort: quicksort that switches to heapsort once recursion depth exceeds (c \log n).

Transfer / Where This Shows Up Later

S2 M2 (this module). Partition is the engine of quickselect (concept 09) and the Dutch National Flag partitioning used in radix-like tricks.
S2 M5 (advanced structures). Treap operations use the same randomization argument -- the randomness lives in the data structure's priorities, not in the algorithm.
S3 (software design). "Choose a pivot, recurse on each side" is the same shape as many tree-recursive design patterns (divide-and-conquer in a Visitor, for example).
S4 (systems programming). Cache-friendly partition passes inform how you iterate over structs-of-arrays vs arrays-of-structs for SIMD sorting.
S5 (OS). Randomized scheduling (e.g., lottery scheduling) uses the same "convert adversarial input into bad luck" trick as randomized quicksort.
S6 (databases). Query optimizers use sample-based pivot selection for histogram building and for picking shard boundaries -- the quicksort partition primitive in disguise.
S7 / S8 (architecture, system design). Randomization as a defense against adversarial input reappears in consistent hashing (random salt per node) and in load shedding (random-early-drop).

Check Yourself

What partition shape gives quicksort its worst case, and how often does it happen for a random pivot?
Why does a random pivot give expected (\Theta(n \log n)) even though the worst case is still (\Theta(n^2))?
What goes wrong with first-element pivot choice on an already-sorted array?
When does 3-way partitioning help, and by how much on an all-equal-key input?
Why do production libraries cap recursion depth instead of "just trusting the randomization"?

Mini Drill or Application

Implement Lomuto partition on paper for [9, 4, 6, 2, 7, 1, 8, 3, 5] with pivot 5.
Derive the expected-comparison-count recurrence for random-pivot quicksort and solve it to get (\sim 2 n \ln n).
Design one input that makes "median-of-three" quicksort still take (\Theta(n^2)) (hint: alternate small and large values to keep the chosen median extreme).
Implement 3-way quicksort and compare it against 2-way Lomuto on an array of 1M integers each drawn from {0, 1, 2}.
Implement introsort (quicksort + depth limit + heapsort fallback + insertion-sort cutoff) and verify it handles a constructed quicksort-killer input without performance collapse.
Production scenario. You are building a library-sort wrapper for a financial back-office reconciliation job that must meet a 30-second SLA over 50M trades/day. Argue whether quicksort with randomized pivots, introsort, or Timsort is the right default; what regression would you write to detect a replacement sort that silently changed stability behavior and broke multi-key (ticker, timestamp) ordering?

Read This Only If Stuck

CLRS: 7.1 Description of Quicksort
CLRS: 7.2 Performance of Quicksort
CLRS: 7.3 Randomized Quicksort
CLRS: 7.4 Analysis of Quicksort
Skiena: 4.6 Quicksort -- Sorting by Randomization
Skiena: 4.6.2 Randomized Algorithms
Sedgewick: Quicksort
Sedgewick: Quicksort (tuning)
Princeton Sedgewick: 2.3 Quicksort -- 3-way partitioning, killer inputs, and the entropy-optimal bound for duplicate keys.
MIT 6.006 Lecture 4: Heaps and Heap Sort -- contrast with heapsort as the worst-case backup inside introsort.
VisuAlgo: Sorting (quicksort animation with pivot choice toggle) -- watch partition step-by-step and switch pivot strategies mid-trace.
Sedgewick: Quicksort is Optimal (talk, 2002) -- the entropy-optimal analysis of 3-way quicksort on inputs with many duplicate keys.
pdqsort paper (Peters 2021) -- pattern-defeating quicksort, the algorithm behind Rust sort_unstable and much of libstdc++.
cp-algorithms: Sorting and randomization primer -- contest-grade quicksort with the random-pivot pattern spelled out.
McIlroy: A Killer Adversary for Quicksort (1999) -- the paper that built deterministic (\Theta(n^2)) inputs against a pre-randomization qsort.
Bentley & McIlroy: Engineering a Sort Function (1993) -- the canonical quicksort-tuning paper that shaped every modern C library.

What This Concept Is​

Why It Matters Here​

Concrete Examples​

Common Confusion / Misconceptions​

How To Use It​

Transfer / Where This Shows Up Later​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​