Module Quiz
Complete this quiz after finishing all concept and practice pages. Score yourself honestly: a "right answer" here means you can explain it in sentences, not just circle the correct letter.
Current Module Questions
Question 1: Insertion Sort Invariant
State the loop invariant for insertion sort at the start of the outer iteration over index i.
Answer: Before the outer iteration at index i, A[0..i-1] contains the elements originally in A[0..i-1], but in sorted order.
Solution Walkthrough:
- Initialization: at
i = 1,A[0..0]is trivially sorted. - Maintenance: the inner loop slides
A[i]leftward past larger elements and places it, soA[0..i]is sorted using the same elements. - Termination: after the loop,
i = n, andA[0..n-1]is sorted -- which is the goal.
Question 2: Mergesort Recurrence
Write the recurrence for mergesort and solve it asymptotically.
Answer: T(n) = 2 T(n/2) + Theta(n), which solves to T(n) = Theta(n log n).
Solution Walkthrough:
- Divide step: split the array into halves --
O(1)ignoring the allocation cost. - Conquer step: two recursive calls on halves,
2 T(n/2). - Combine step: merge is linear in the combined size,
Theta(n). - By the master theorem (
a = 2, b = 2, f(n) = Theta(n), log_b a = 1) we are in case 2, givingTheta(n log n).
Question 3: Quicksort Worst Case
What input pattern makes classical quicksort (pivot = last element) take Theta(n^2), and why does random pivoting fix it in expectation?
Answer: An already-sorted (or reverse-sorted) array makes each partition put n-1 elements on one side and 0 on the other, so the recurrence degenerates to T(n) = T(n-1) + Theta(n) = Theta(n^2). With a random pivot, the expected split is balanced in expectation, giving expected Theta(n log n).
Question 4: Comparison-Sort Lower Bound
Using a decision-tree argument, prove any comparison-based sort requires Omega(n log n) comparisons in the worst case.
Answer: A decision tree for any comparison-based sort must have at least n! leaves (one per possible output permutation). A binary tree with L leaves has height >= ceil(log2 L). So the worst-case height -- number of comparisons -- is >= log2(n!) = Theta(n log n) by Stirling's approximation.
Solution Walkthrough:
- Any comparison-based sort's execution is a root-to-leaf path in its decision tree.
- Different permutations must lead to different leaves (else some permutation is mis-sorted).
n!permutations require>= n!leaves.- Height of a binary tree with
>= n!leaves is>= log2(n!). log2(n!) = n log2 n - n log2 e + O(log n) = Theta(n log n).
Question 5: Counting Sort and Stability
Why is counting sort stable, and where does the stability actually get established?
Answer: Stability is established by the final pass that walks the input array left to right, placing each element at the prefix-sum cursor for its key and incrementing the cursor. Because equal-keyed elements are processed in their original order, they keep their original relative order in the output.
Question 6: Radix Sort Requirement
Why does LSD radix sort require its inner sorting pass to be stable?
Answer: LSD radix processes digits from least significant to most significant. When the pass on a higher-order digit is performed, it must preserve the relative order established by earlier passes on lower-order digits. If the inner sort is not stable, the algorithm can reorder keys that agreed on the new digit but had been correctly ordered by earlier digits, breaking correctness.
Question 7: Binary Search Invariant
For binary_search(A, target) with a closed interval [lo, hi], state the invariant and explain why updating hi = mid - 1 preserves it when A[mid] > target.
Answer: Invariant: if target is in A, it is in A[lo..hi]. When A[mid] > target, positions mid..hi cannot contain target (all values there are >= A[mid] > target by sortedness), so shrinking to A[lo..mid-1] still contains target if it is in A.
Question 8: Quickselect Expected Time
Why is randomized quickselect O(n) in expectation even though its worst case is O(n^2)?
Answer: With a uniform random pivot, the expected rank of the pivot makes each side have expected size at most 3n/4, so E[T(n)] <= T(3n/4) + cn. This recurrence solves to E[T(n)] = O(n) by the geometric sum of level costs.
Solution Walkthrough:
- A random pivot has at least a
1/2probability of being in the middle half of ranks. - In that case, the recursive call is on at most
3n/4elements. - Therefore
E[T(n)] <= T(3n/4) + cnin the dominant case. - Unrolling:
T(n) <= cn + c(3n/4) + c(3n/4)^2 + ... = cn / (1 - 3/4) = O(n).
Question 9: Hash Table Expected vs Worst Case
Why is the O(1) bound on hash-table operations expected rather than worst case?
Answer: The bound relies on the assumption that keys spread uniformly across buckets. If all keys collide in one bucket (adversarial input, bad hash, or high load), operations degrade to O(n). Hashing is a randomized model whose guarantee depends on the hash function and keys cooperating.
Question 10: Chaining vs Open Addressing
At load factor alpha = 0.9, which collision-resolution scheme degrades faster -- chaining or open addressing? Why?
Answer: Open addressing degrades much faster. Expected successful-search cost in chaining is 1 + alpha = 1.9. Expected successful-search cost in uniform-probing open addressing is roughly (1/alpha) ln(1/(1-alpha)) ≈ (1/0.9) ln(10) ≈ 2.56, and unsuccessful search is worse, 1/(1-alpha) = 10. For linear probing, the unsuccessful cost is (1/2)(1 + 1/(1-alpha)^2) = 50.5, which is catastrophic.
Question 11: Universal Hashing
Define a universal family of hash functions and give one example.
Answer: A family H is universal if for any two distinct keys x != y, P_{h in H}(h(x) = h(y)) <= 1/m where m is the table size. Example: h_{a,b}(x) = ((a*x + b) mod p) mod m where p is a prime larger than the key universe and a in [1, p-1], b in [0, p-1] are chosen uniformly at random.
Question 12: Binary Heap Build Cost
Why does build_heap run in O(n) rather than O(n log n) even though each sift_down takes O(log n)?
Answer: Nodes at depth h from the leaves sift down at most h levels. There are n / 2^{h+1} nodes at that depth. Total work is sum_{h=0}^{log n} (n / 2^{h+1}) * h = n * sum h / 2^{h+1}, which converges to O(n) (the series is geometric, not harmonic).
Solution Walkthrough:
- Leaves (bottom level) do zero work:
~n/2nodes withh = 0. - Next level does at most 1 sift step:
n/4nodes withh = 1. - The geometric series
sum h / 2^hconverges to a constant. - Total:
O(n).
Question 13: Heap vs BST
What is the fundamental operational difference between a binary heap and a balanced BST?
Answer: A heap only orders parent-child pairs (root is the extreme), so it supports peek, insert, extract_min in O(log n) but search for an arbitrary key is O(n). A BST orders every subtree, supporting search, insert, delete, and ordered traversal all in O(log n), but at the cost of more pointer chasing and a more complex implementation.
Question 14: Priority Queue Use Cases
Name three problem patterns that are naturally solved with a priority queue and explain one in detail.
Answer: Event-driven simulation, Dijkstra shortest path, top-K.
Detail (Dijkstra): maintain a min-heap of (tentative_distance, vertex). Repeatedly extract the closest unsettled vertex, mark it settled, and relax its outgoing edges -- for each neighbor, if dist[u] + w(u, v) < dist[v], update dist[v] and push or decrease_key in the PQ. Correctness follows from the invariant that when a vertex is extracted, its tentative distance equals its true shortest-path distance. Cost with a binary heap: O((V + E) log V).
Question 15: Choosing a Sort
You must sort 10 million 64-bit integer IDs in memory, with no requirement on stability. Which sort do you pick and why?
Answer: LSD radix sort with 8-bit digits (8 passes) is usually the fastest option. It is linear in n for fixed-width integers, cache-friendly, and avoids comparison branching. If a library sort is already well-tuned (introsort in C++, Timsort in Python), it is often good enough, but a custom radix can be 2-4x faster on large uniform integer data.
Interleaved Review Questions
Prior Module Question 1 (S2M1: Asymptotics)
State the master theorem case for T(n) = 2 T(n/2) + Theta(n log n).
Answer: This is case 2 (extended) -- the work at each level is Theta(n log n), which matches f(n) exceeding the regular case-2 form. A careful analysis gives T(n) = Theta(n log^2 n) via the recursion tree.
Prior Module Question 2 (S2M1: Correctness)
What are the three parts of an induction-based loop-invariant proof?
Answer: Initialization (invariant holds before the first iteration), maintenance (if it holds before an iteration, it holds after), and termination (when the loop exits, the invariant implies the desired property).
Prior Module Question 3 (S1M3: Probability)
Using linearity of expectation, derive the expected number of empty buckets when n keys hash uniformly into m buckets.
Answer: Let X_j be the indicator that bucket j is empty. P(X_j = 1) = (1 - 1/m)^n. By linearity, E[number empty] = m * (1 - 1/m)^n.
Prior Module Question 4 (S1M3: Probability)
Why does linearity of expectation apply to dependent indicators too?
Answer: Linearity of expectation does not require independence. E[X + Y] = E[X] + E[Y] for any random variables. Variance formulas care about dependence; sums of expectations do not.
Prior Module Question 5 (S1M2: Combinatorics)
How many distinct comparison sequences can sort n distinct elements, and how does that relate to the sorting lower bound?
Answer: There are n! possible orderings of the input, so a correct comparison-based sort must emit one of n! distinguishable execution paths. Since each comparison yields one bit of information, at least log2(n!) = Theta(n log n) comparisons are required in the worst case.
Self-Assessment and Remediation
Mastery Level (90-100% correct):
- Ready to advance to Module 3 (Graph Algorithms).
Proficient Level (75-89% correct):
- Review concept pages matching your missed questions. Redo two problems per missed concept.
Developing Level (60-74% correct):
- Rework the practice pages, especially Cluster 1 (sorting analysis) and Cluster 4 (hashing).
- Re-attempt the code katas that correspond to weak areas.
Insufficient Level (<60% correct):
- Return to the concept sequence in order. Do not advance until you can write mergesort, quicksort, binary search, and a binary heap from scratch without looking at references. If the issue is asymptotic-analysis mechanics (recurrences, decision trees), revisit Semester 2 Module 1 before continuing.