Hashing and Priority-Queue Clinic

Retrieval Prompts

State the simple-uniform-hashing assumption in one sentence.
Write the inequality that defines a universal family of hash functions.
Give the expected search cost for chaining at load factor alpha.
Give the expected successful search cost for open addressing at load factor alpha.
State the parent and child index formulas for a 0-indexed binary heap.
Give the total cost of build_heap and sketch why it is O(n), not O(n log n).
Name three priority-queue problem shapes beyond Dijkstra.

For each statement, identify the error:

Implement a chaining hash table with resize-on-load-factor. Parameters: initial capacity 8, target alpha = 0.75, doubling on resize. Test with insert, lookup, delete, and one resize trigger.
Implement a linear-probing hash table with tombstone-based deletion. Rehash when the fraction of non-empty slots (live keys plus tombstones) exceeds 0.5, because tombstones still lengthen probe sequences and must count as occupied.
Using linearity of expectation, compute:
- the expected number of empty buckets when n = 200 keys hash uniformly into m = 256 buckets
- the expected number of colliding pairs (unordered pairs of distinct keys i, j with h(i) = h(j)) for the same n and m
Demonstrate a collision-attack scenario: using a trivially predictable hash h(x) = x mod m, build 100 keys that all land in bucket 0 and show the resulting worst-case lookup time.
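
A minimal sketch for checking your hand calculations, assuming every key hashes independently and uniformly into m buckets. The function names (expected_empty_buckets, expected_colliding_pairs, attack_keys) are made up for this illustration. Work the two expectations out on paper first, then compare.

```python
from math import comb

def expected_empty_buckets(n, m):
    # A fixed bucket stays empty with probability (1 - 1/m)^n.
    # Linearity of expectation sums that indicator over all m buckets.
    return m * (1 - 1 / m) ** n

def expected_colliding_pairs(n, m):
    # Each of the C(n, 2) unordered pairs collides with probability 1/m.
    return comb(n, 2) / m

def attack_keys(m, count):
    # Every multiple of m satisfies x mod m == 0, so all keys hit bucket 0.
    return [i * m for i in range(count)]

n, m = 200, 256
print("expected empty buckets:", expected_empty_buckets(n, m))
print("expected colliding pairs:", expected_colliding_pairs(n, m))

keys = attack_keys(m, 100)
buckets_hit = {k % m for k in keys}
print("distinct buckets hit by attack keys:", len(buckets_hit))
```

With all 100 attack keys in one chain, a lookup for the last key inserted (or for an absent key that also maps to bucket 0) has to scan the whole chain, which is the linear worst case the exercise asks you to demonstrate.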

Implement sift_up and sift_down for an array-backed min-heap. Test on adversarial inputs (sorted ascending, sorted descending, all duplicates).
Implement build_heap in O(n) by sifting down from index floor(n/2) - 1 to 0. Verify on [7, 4, 9, 2, 6, 8, 1, 5, 3].
Implement a priority queue with decrease_key(handle, new_key) backed by a heap and a handle -> index dictionary. Test by running Dijkstra on a 10-node graph.
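
One possible shape for the sift-down and build_heap exercises, offered as a sketch to compare against your own attempt rather than a reference solution. It assumes a 0-indexed list and a min-heap. The helper is_min_heap is a hypothetical checker added here so the result can be verified mechanically.

```python
def sift_down(a, i, n):
    """Restore the min-heap property for the subtree rooted at i within a[:n]."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return  # both children are >= a[i], so the subtree is a heap
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest

def build_heap(a):
    """Bottom-up heapify: sift down every internal node, last one first."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)

def is_min_heap(a):
    """True if every non-root element is >= its parent at (i - 1) // 2."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

data = [7, 4, 9, 2, 6, 8, 1, 5, 3]
build_heap(data)
print(data, is_min_heap(data))
```

Indices from n // 2 onward are leaves, so the loop can skip them. The same is_min_heap check can be reused on the adversarial inputs (ascending, descending, all duplicates).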

Pick three of the following and implement:

Event-driven simulation of a single-server queue (Poisson arrivals, exponential service). Report mean wait time.
Top-K over a stream of 1 million floats using a min-heap of size K.
Median maintenance over the same stream with a max-heap/min-heap split.
k-way merge of 10 sorted integer files using heapq.merge.
Huffman code construction for a small alphabet with listed frequencies.
Dijkstra shortest path on a 20-node weighted graph.
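
As an illustration of the Top-K shape from the list above, here is a hedged sketch built on the standard heapq module. The function name top_k is hypothetical, and the stream is shortened to 100,000 seeded random floats so the example runs quickly. Scale it up for the actual exercise.

```python
import heapq
import random

def top_k(stream, k):
    """Return the k largest items of an iterable, largest first."""
    if k <= 0:
        return []
    heap = []  # min-heap of the k largest values seen so far
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest survivor: pop it and push x in one step.
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

rng = random.Random(0)  # fixed seed so runs are repeatable
values = [rng.random() for _ in range(100_000)]
result = top_k(values, 10)
print(result[:3])
```

The heap never holds more than K items, so the cost is O(n log K) time and O(K) extra space, compared with O(n log n) for sorting the whole stream.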

At load factor alpha = 0.5 in linear probing, how many probes does a successful search make on average? An unsuccessful search?
For a heap, derive the O(n) build bound by bounding sum_{h=0}^{floor(log2 n)} (n / 2^{h+1}) * h, where roughly n / 2^{h+1} nodes sit at height h and sifting one of them down costs O(h).
For Dijkstra using a binary heap on a graph with V vertices and E edges, show that the total PQ work is O((V + E) log V).
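
After attempting the derivations by hand, a numeric check like the sketch below can confirm the results. It assumes the classical Knuth estimates for linear probing (which hold under idealized uniform hashing) and evaluates the build-heap sum without ceilings. All function names are invented for this example.

```python
from math import floor, log2

def linear_probe_successful(alpha):
    # Classical estimate: (1/2) * (1 + 1 / (1 - alpha))
    return 0.5 * (1 + 1 / (1 - alpha))

def linear_probe_unsuccessful(alpha):
    # Classical estimate: (1/2) * (1 + 1 / (1 - alpha)^2)
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)

def build_heap_sum(n):
    # sum over heights h = 0 .. floor(log2 n) of (n / 2^(h+1)) * h
    return sum((n / 2 ** (h + 1)) * h for h in range(floor(log2(n)) + 1))

print("successful probes at alpha = 0.5:", linear_probe_successful(0.5))
print("unsuccessful probes at alpha = 0.5:", linear_probe_unsuccessful(0.5))
for n in (10, 1_000, 1_000_000):
    print(n, build_heap_sum(n), build_heap_sum(n) / n)
```

The ratio build_heap_sum(n) / n stays below 1 because the infinite series sum of h / 2^{h+1} converges to 1, which is the core of the O(n) argument.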

This page is complete only if you can:

explain expected O(1) hashing in terms of a probability model and name what breaks it
implement chaining and open-addressing hash tables from scratch and explain the deletion difference
write sift-up and sift-down without looking at templates
recognize a priority-queue problem shape and implement it

Complete these small exercises before the larger katas:

Insert 12 keys into 8 buckets using chaining. Draw the buckets and compute load factor.
Repeat with linear probing. Show every probe for at least four insertions.
Delete a key from the open-addressed table and explain why a tombstone may be needed.
Build a min-heap from [9, 4, 7, 1, 3, 6, 2] using bottom-up heapify. Show the array after each sift-down.
Use a priority queue to merge three sorted streams and write the runtime in terms of total items and number of streams.
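
For the stream-merge exercise, a small sketch using heapq directly (instead of heapq.merge) makes the priority-queue mechanics visible. The name merge_streams and the _DONE sentinel are hypothetical. The sketch assumes each input is already sorted ascending and that items are mutually comparable.

```python
import heapq

_DONE = object()  # sentinel marking an exhausted stream

def merge_streams(streams):
    """Yield all items from several sorted iterables in ascending order."""
    iters = [iter(s) for s in streams]
    heap = []
    for idx, it in enumerate(iters):
        first = next(it, _DONE)
        if first is not _DONE:
            heap.append((first, idx))  # idx breaks ties and names the source
    heapq.heapify(heap)
    while heap:
        value, idx = heap[0]
        yield value
        nxt = next(iters[idx], _DONE)
        if nxt is _DONE:
            heapq.heappop(heap)  # this stream is finished
        else:
            heapq.heapreplace(heap, (nxt, idx))  # swap in its next item

merged = list(merge_streams([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
print(merged)
```

The heap holds at most one entry per stream, so with N total items and k streams each item costs O(log k), for O(N log k) overall.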

Evidence check: include tests that force collisions and tests that verify heap order after every public operation.
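
One way to force collisions in a test is to inject a constant hash function. The sketch below is a deliberately small, fixed-capacity linear-probing set (no resize) that shows why deleting by emptying a slot breaks lookups while a tombstone does not. The class LinearProbingSet, its hash_fn parameter, and the use_tombstone flag are assumptions made for this illustration only.

```python
_EMPTY = object()
_TOMBSTONE = object()

class LinearProbingSet:
    """Fixed-capacity open-addressed set, for illustration only."""

    def __init__(self, capacity=8, hash_fn=hash):
        self.slots = [_EMPTY] * capacity
        self.hash_fn = hash_fn

    def _probe(self, key):
        m = len(self.slots)
        start = self.hash_fn(key) % m
        for step in range(m):
            yield (start + step) % m

    def add(self, key):
        first_free = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                if first_free is None:
                    first_free = i
                break  # key cannot appear past an empty slot
            if slot is _TOMBSTONE:
                if first_free is None:
                    first_free = i  # reusable, but keep scanning for key
            elif slot == key:
                return  # already present
        if first_free is None:
            raise RuntimeError("table full")
        self.slots[first_free] = key

    def __contains__(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                return False
            if slot is not _TOMBSTONE and slot == key:
                return True
        return False

    def remove(self, key, use_tombstone=True):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                return
            if slot is not _TOMBSTONE and slot == key:
                self.slots[i] = _TOMBSTONE if use_tombstone else _EMPTY
                return

def always_zero(key):
    return 0  # every key collides, so all keys share one probe sequence

good = LinearProbingSet(8, hash_fn=always_zero)
broken = LinearProbingSet(8, hash_fn=always_zero)
for table in (good, broken):
    for k in ("a", "b", "c"):
        table.add(k)
good.remove("b")                         # leaves a tombstone in slot 1
broken.remove("b", use_tombstone=False)  # leaves a hole in slot 1
print("c" in good, "c" in broken)
```

In the broken table the search for "c" stops at the hole left by "b" and reports a miss even though "c" is still stored in slot 2.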