Skip to main content

Randomized and Range-Query Clinic

Retrieval Prompts

  1. Explain why a skip list's expected height is O(log n) without computing the constant.
  2. Define a treap's two invariants: BST on keys and heap on priorities.
  3. State the Bloom filter guarantee in both directions (positives and negatives).
  4. Write the formulas for optimal bit count m and hash count k given n and target false-positive rate p.
  5. Explain the role of the segment tree's lazy-propagation array.
  6. Explain how a Fenwick tree recovers prefix sums using the low-bit trick.

Compare and Distinguish

Separate these pairs clearly:

  • skip list vs treap (where the randomness lives: in the levels vs in the priorities)
  • Bloom filter vs hash set (why Bloom saves memory and what it gives up)
  • segment tree vs Fenwick tree (what operations each supports; where segment trees win)
  • point update with range query vs range update with point query (how Fenwick covers both cases with one trick)
  • path copying vs fat nodes (two ways to get persistence; which generalizes to arbitrary DAGs)

Common Mistake Check

For each statement, identify the error:

  1. "A Bloom filter can return a false negative if two elements hash to the same slot."
  2. "A segment tree uses exactly 2n nodes."
  3. "A Fenwick tree supports arbitrary range-min queries."
  4. "A persistent list must copy the entire list on every update."
  5. "Adding more hash functions to a Bloom filter always reduces the false-positive rate."

Mini Application

Part A - Bloom filter sizing

Design a Bloom filter for the following constraints, and show your work:

  1. n = 10^6 keys, target false-positive rate p = 0.01
  2. n = 10^8 keys, target false-positive rate p = 0.001
  3. n = 10^4 keys, bit budget fixed at m = 100,000 (find achievable p and optimal k)

Use m = -n * ln(p) / (ln 2)^2 and k = (m/n) * ln 2, rounding to sensible integers.

Part B - Segment tree derivation

Given an array of length n = 8 and the recursive segment-tree build, answer:

  1. how many leaves, internal nodes, and total nodes
  2. for a point update at index 3, which nodes get touched
  3. for a range query sum[2..6], which nodes' values are combined
  4. sketch the lazy-propagation path for a range update add 5 to [1..4]

Part C - Persistent structure

Given a persistent BST with versions v0, v1, v2 where:

  • v0 = empty
  • v1 = insert(5) into v0
  • v2 = insert(3) into v1

Draw the shared node graph. Count how many nodes are physically allocated across all three versions. Explain why full persistence costs O(log n) per operation rather than O(n).

Evidence Check

This page is complete only if you can:

  • size a Bloom filter from n and p without looking up the formula
  • trace point update, range query, and lazy range update on an 8-element segment tree by hand
  • compute lowbit(i) and explain Fenwick index traversal for query and update
  • describe path copying in a persistent BST and count shared nodes across two versions