Hashing and Priority-Queue Clinic
Retrieval Prompts
- State the simple-uniform-hashing assumption in one sentence.
- Write the inequality that defines a universal family of hash functions.
- Give the expected search cost for chaining at load factor
alpha. - Give the expected successful search cost for open addressing at load factor
alpha. - State the parent and child index formulas for a 0-indexed binary heap.
- Give the total cost of
build_heapand sketch why it isO(n), notO(n log n). - Name three priority-queue problem shapes beyond Dijkstra.
Compare and Distinguish
- chaining vs open addressing (performance profile, memory layout, deletion)
- universal hashing vs cryptographic hashing
- hash table vs balanced BST (strengths of each)
- binary heap vs binary search tree (what each preserves)
extract_minvspeekon a heapdecrease_keywith an index map vs "lazy" decrease-key with duplicates
Common Mistake Check
For each statement, identify the error:
- "A hash table is
O(1)worst case." - "Open addressing is always faster than chaining."
- "Deleting a key in linear probing just means marking the slot empty."
- "A binary heap is fully sorted in its array representation."
- "Heapsort is strictly faster than quicksort because of its better worst case."
- "Dijkstra's algorithm requires Fibonacci heaps to be efficient."
- "You can search for an arbitrary key in a heap in
O(log n)."
Mini Application -- Hash Tables
-
Implement a chaining hash table with resize-on-load-factor. Parameters: initial capacity
8, targetalpha = 0.75, doubling on resize. Test with insert, lookup, delete, and one resize trigger. -
Implement linear-probing hash table with tombstone-based deletion. Include a rehash at
alpha = 0.5(since tombstones count as occupied for probe-sequence correctness). -
Using linearity of expectation, compute:
- expected number of empty buckets when
n = 200,m = 256 - expected number of collisions (pairs of keys with
h(i) = h(j)) whenn = 200keys hash uniformly intom = 256buckets
- expected number of empty buckets when
-
Demonstrate a collision-attack scenario: using a trivially predictable hash
h(x) = x mod m, build 100 keys that all land in bucket 0 and show the resulting worst-case lookup time.
Mini Application -- Binary Heaps
-
Implement
sift_upandsift_downfor an array-backed min-heap. Test on adversarial inputs (sorted ascending, sorted descending, all duplicates). -
Implement
build_heapinO(n)by sifting down from indexn/2 - 1to0. Verify on[7, 4, 9, 2, 6, 8, 1, 5, 3]. -
Implement a priority queue with
decrease_key(handle, new_key)backed by a heap and ahandle -> indexdictionary. Test by running Dijkstra on a 10-node graph.
Mini Application -- PQ Patterns
Pick three of the following and implement:
- Event-driven simulation of a single-server queue (Poisson arrivals, exponential service). Report mean wait time.
- Top-K over a stream of 1 million floats using a min-heap of size K.
- Median maintenance over the same stream with a max-heap/min-heap split.
k-way merge of 10 sorted integer files usingheapq.merge.- Huffman code construction for a small alphabet with listed frequencies.
- Dijkstra shortest path on a 20-node weighted graph.
Analysis Drills
- Given load factor
alpha = 0.5in linear probing, how many probes does a successful search do on average? Unsuccessful search? - For a heap, derive the
O(n)build bound by summingsum_{h=0}^{log n} (n / 2^{h+1}) * h. - For Dijkstra using a binary heap on a graph with
Vvertices andEedges, show that the total PQ work isO((V + E) log V).
Evidence Check
This page is complete only if you can:
- explain expected
O(1)hashing in terms of a probability model and name what breaks it - implement chaining and open-addressing hash tables from scratch and explain the deletion difference
- write sift-up and sift-down without looking at templates
- recognize a priority-queue problem shape and implement it
Integrated Hashing and Heap Drill
Complete these small exercises before the larger katas:
- Insert 12 keys into 8 buckets using chaining. Draw the buckets and compute load factor.
- Repeat with linear probing. Show every probe for at least four insertions.
- Delete a key from the open-addressed table and explain why a tombstone may be needed.
- Build a min-heap from
[9, 4, 7, 1, 3, 6, 2]using bottom-up heapify. Show the array after each sift-down. - Use a priority queue to merge three sorted streams and write the runtime in terms of total items and number of streams.
Evidence check: include tests that force collisions and tests that verify heap order after every public operation.