Skip to main content

Exercises

Four lanes of book-scoped work. Each lane pulls from CLRS or a supporting text and targets a different mental muscle: proofs, implementation, pragmatics, contest-shaped problems. Aim for at least one lane per week during the module.

Lane 1 - CLRS Proof Drills

Target outcome: produce rigorous, short proofs of canonical balanced-tree and amortized-analysis properties.

Chunks:

Exercises:

  1. Prove that any red-black tree on n internal nodes has height at most 2 log2(n+1). Use the black-height invariant directly.
  2. Show that a left rotation preserves the BST property. Identify the three parent pointers that must update.
  3. Prove that the dynamic array with doubling has O(1) amortized append by (a) aggregate method, (b) accounting method ($3 per append), and (c) potential method with Phi(T) = 2 * size - capacity. Reconcile the three answers.
  4. Prove that without path compression but with union by rank, a sequence of m operations on n elements costs O(m log n). Then state and justify the O(m alpha(n)) bound once path compression is added.
  5. Use the potential method to show the amortized cost of a binary counter increment (over n increments starting at 0) is O(1). State your Phi.

Completion: hand-write each proof in under one page; self-grade for unstated assumptions.

Lane 2 - Implementation from Book

Target outcome: translate book pseudocode into working, tested code; discover where implementation diverges from theory.

Chunks:

Exercises:

  1. Implement red-black insert from CLRS 13.3. Add an invariant-checker that runs after every call and fails loudly on a violation. Feed 10,000 random insertions.
  2. Implement the disjoint-set forest with both union by rank and path compression from CLRS 19.3. Add an instrumentation counter for the number of pointer writes in find; plot against log* n on a logarithmic x-axis.
  3. Implement a segment tree with lazy propagation from Competitive Programming 2.4.3. Support range sum and range add. Cross-check against a brute-force array on 10^4 random sequences.
  4. Implement a Fenwick tree from Competitive Programming 2.4.4. Add one: a 2-D variant supporting rectangle-sum query.
  5. Compare measured runtimes of red-black insert vs skip list insert on n = 10^6 keys; discuss where the constants come from.

Completion: each implementation runs, passes randomized cross-checks, and has a short (two-paragraph) lab notebook.

Lane 3 - Skiena Pragmatics

Target outcome: practice picking the right structure for a real engineering constraint.

Chunks:

Exercises:

  1. Your service processes 10,000 request/sec of high-cardinality string keys. Describe the memory footprint and throughput trade-off between a hash set, a red-black tree, a Bloom filter, and a trie. Pick one and justify.
  2. You must compute an MST on a graph with V = 10^6 vertices and E = 5 * 10^6 weighted edges. Pick Kruskal with union-find vs Prim with a Fibonacci heap. Justify by asymptotic and constant-factor reasoning.
  3. Design a "user saw this recently" cache for a social feed, budget 128 MB. You will accept some false positives. Size a Bloom filter; describe what happens at the 1-year mark when n has grown 10x.
  4. Your analytics pipeline answers range-sum queries on a constantly mutating integer series of length 10^8. Select a data structure. Justify segment tree vs Fenwick vs prefix-sum array.

Completion: each answer fits in 300 words and references a concrete bound.

Lane 4 - Contest / Engineering Problem Set

Target outcome: apply the module's primary structures to an end-to-end problem under time pressure.

Chunks:

Exercises:

  1. "Dynamic connectivity (incremental)." Input: n <= 10^5 nodes and q <= 10^5 operations each of form union(u, v) or connected?(u, v). Implement with union-find + path compression. Time target: under 200 ms in your chosen language.
  2. "Range sum with point update." Input: array a[1..n] with n <= 10^6 and q <= 10^6 mixed operations. Implement a Fenwick tree. Time target: under 500 ms.
  3. "Range add, range sum." Input: array a[1..n] with n, q <= 10^5. Implement segment tree with lazy propagation.
  4. "Offline MST on dense graphs." Input: V <= 2000, complete weighted graph. Implement Kruskal with union-find. Verify weight against Prim-with-heap output.
  5. "Deduplicate a 100 GB log stream on a 128 MB budget." Implement a Bloom filter-based dedup with measured false-positive rate and justified m, k.

Completion: each problem solved, timed, and writeup includes the chosen asymptotic bound and measured wall-clock time.

Completion Checklist

  • all four lanes have at least three exercises completed
  • Lane 1 proofs are hand-verified for at least three items
  • Lane 2 implementations run on randomized cross-checks without failures
  • Lane 3 decisions cite concrete bounds, not hand-waves
  • Lane 4 problems meet stated time budgets; at least one is benchmarked against a reference implementation
  • notebook entries exist for each completed exercise (date, time-to-solve, main bug)