Searching and Selection Workshop
Retrieval Prompts
- State the loop invariant for binary search on a closed interval
[lo, hi]. - State the invariant change needed to compute
lower_boundinstead of equality. - Define predicate-based binary search and name one condition the predicate must satisfy.
- Describe the recurrence for expected-time quickselect.
- What does median-of-medians guarantee that random quickselect does not?
- When is "min-heap of size K" better than
sorted(data)[:K]for top-K, and when is it worse?
Compare and Distinguish
- binary search (equality) vs
lower_boundvsupper_bound - "search on a sorted array" vs "binary search on the answer" (predicate over a value range)
- quickselect vs median-of-medians
- top-K via heap of size K vs top-K via
select+ partition - linear search in an unsorted array vs hash-table lookup
Common Mistake Check
For each statement, identify the error:
- "Binary search on a linked list is
O(log n)." - "
mid = (lo + hi) / 2is always safe." - "Quickselect is worst-case
O(n)because the recurrence isT(n) = T(3n/4) + n." - "To find the median of
nelements you must first sort them all." - "A predicate-based binary search works on any boolean function of the position."
- "
std::nth_elementgives you a fully sorted array."
Mini Application -- Invariant-First Coding
Before writing any code for any of these, declare the region convention (closed or half-open), write the invariant in one sentence, and only then code.
binary_search(A, target) -> index | -1: classical equality search.lower_bound(A, x) -> index: smallestiwithA[i] >= x. If no suchi, returnlen(A).upper_bound(A, x) -> index: smallestiwithA[i] > x.first_occurrence(A, x)andlast_occurrence(A, x): two one-liners using the two bounds above.- Predicate search: given a monotone non-decreasing boolean function
P(k)onk in [lo, hi], return the smallestkwithP(k) = true.
Mini Application -- Selection Problems
-
Implement
quickselect(A, k)with a random pivot. Test that it returns the same value assorted(A)[k-1]over 1000 random trials. -
Solve: given an array of size
nand a valuek, return theksmallest elements. Do it three ways:- sort, slice
- quickselect for the
k-th, then partition - min-heap of size
kscanning the stream Measure the running times forn = 10^6and variousk.
-
Solve: given a matrix sorted rowwise and columnwise, return the
k-th smallest element inO(k log min(k, n))using a heap.
Mini Application -- Binary Search on the Answer
-
Given
nhouses at integer positions andmmailboxes to place, minimize the maximum distance from any house to its nearest mailbox. Frame this as predicate-based binary search. -
Given
njobs with durations andkworkers, each worker processes a contiguous batch. Minimize the maximum worker load. Define the predicate and argue its monotonicity. -
Given a sorted array, find the square root of an integer
xusing only integer binary search.
Evidence Check
This page is complete only if you can:
- write binary search and its two bound variants without looking at templates
- explain what a predicate must satisfy to support binary search on the answer
- pick among quickselect, heap-of-K, and full-sort for a specific top-K task and justify
Integrated List and Search Drill
Implement and test each operation without using a library shortcut that hides the algorithmic work:
- Remove all elements that satisfy a predicate while preserving order.
- Merge two sorted lists into a new sorted list.
- Return whether any two values sum to a target using a hash-backed lookup.
- Reverse a list in place.
- Rotate a list by
kpositions. - Implement linear search and binary search with the same return convention.
Evidence check:
- unit tests for empty, one-item, duplicate, and already-sorted cases
- a short note comparing when linear search beats binary search in practice despite worse asymptotic growth
- one benchmark or hand count showing the cost of sorting before searching many times