Code Katas
Timed, repeatable coding drills. Complete each kata from scratch -- no starter code, no copy-pasting -- until it becomes automatic. Use your preferred language; Python is recommended for its low ceremony.
Kata 1: Mergesort from Scratch
Time limit: 20 minutes
Goal: Recurrence implementation fluency
Setup: Empty file. No imports beyond basic collections.
Implement mergesort(A: list[int]) -> list[int] that returns a new sorted list. Implement a merge(L, R) helper. Include at least three tests: empty list, reverse-sorted list of length 10, list with duplicates.
Repeat until: you can write mergesort + merge + tests in under 10 minutes, with correct handling of empty input and single-element input.
Kata 2: Randomized Quicksort with Small-Subarray Insertion
Time limit: 20 minutes
Goal: In-place partition and hybrid-sort tuning
Setup: Empty file. Allow random import.
Implement in-place quicksort(A, lo, hi) using Lomuto partition with a random pivot (swap a random index with hi before partitioning). When hi - lo <= 15, switch to insertion sort for that subarray. Include a test that shows it handles 10,000 random integers and 10,000 already-sorted integers without timeout.
Repeat until: you can write it in under 15 minutes with correct Lomuto partition from memory.
Kata 3: Heapsort
Time limit: 20 minutes
Goal: Implicit-tree index math
Setup: Empty file.
Implement heapsort(A) that sorts in place. Helpers: sift_down(A, i, size), build_max_heap(A). Do not use any heap library. Include a test verifying that the result equals sorted(A) on a randomized input.
Repeat until: you can derive the parent/child index formulas from memory and write heapsort without referring to a reference.
Kata 4: Binary Search Variants
Time limit: 20 minutes
Goal: Invariant-first binary search
Setup: Empty file.
Implement three functions on a sorted list A:
binary_search(A, x)-- return index ofx, or-1lower_bound(A, x)-- smallest indexiwithA[i] >= x(orlen(A))upper_bound(A, x)-- smallest indexiwithA[i] > x(orlen(A))
Before writing each, write a one-sentence comment stating the loop invariant. Test on: empty array, single element, all-equal array, target smaller than all, target larger than all.
Repeat until: all three pass on the first try, with correct invariants written before the code.
Kata 5: Hash Table from Scratch (Chaining)
Time limit: 25 minutes
Goal: Collision handling, resizing, and API design
Setup: Empty file. No dict, no collections.
Implement a class HashMap with:
__init__(capacity=8)put(key, value)get(key, default=None)remove(key)__len__()- doubling resize when load factor
n / m > 0.75 - chaining with Python lists per bucket
Test: insert 1000 (i, i*i) pairs, verify all lookups, remove half, verify size and lookups.
Repeat until: you can write the class + tests in under 20 minutes with correct resize and without leaking KeyError.
Kata 6: Quickselect and k-Smallest
Time limit: 20 minutes
Goal: Partition as a selection tool
Setup: Empty file.
Implement:
quickselect(A, k)-- return thek-th smallest element using in-place Lomuto partition with random pivot.k_smallest_heap(A, k)-- return theksmallest elements as a list, using a max-heap of sizek(heapqnegated, orheapq.nsmallest).
Compare on A = [random.randint(0, 10**6) for _ in range(10**6)], k = 100. Report wall-clock time for both.
Repeat until: you can implement both in under 15 minutes and articulate which is faster for this n, k pair.
Kata 7: Priority Queue with decrease_key
Time limit: 25 minutes
Goal: Heap + index map
Setup: Empty file. You may import heapq for the raw heap, but the index map is yours.
Implement IndexedPQ:
insert(key, priority)extract_min()returns(key, priority)decrease_key(key, new_priority)contains(key) -> bool
Under the hood, keep a list of (priority, key) as the heap, and a dict key -> index updated on every sift. Test by running Dijkstra on a small hand-coded weighted graph and verifying shortest-path distances.
Repeat until: you can write it correctly in under 20 minutes.
Kata 8: Median Maintenance
Time limit: 15 minutes
Goal: Two-heap invariant
Setup: Empty file. heapq allowed.
Implement MedianStream with add(x) and median(). Use a max-heap for the lower half (via negation) and a min-heap for the upper half. Maintain the invariant 0 <= |lower| - |upper| <= 1. Return the appropriate median depending on parity.
Test against a sorted-array reference implementation on a stream of 10,000 random numbers, verifying the median at every step.
Repeat until: you can write it in under 10 minutes with the invariant handling correctly on odd and even stream lengths.
Completion Standard
- All 8 katas completed within their time limits.
- Each kata repeated at least twice, with the second attempt faster than the first.
- No kata requires looking at a reference or previous solution.
- You can explain, in one sentence, the invariant or principle each kata drills.
- You can identify which kata corresponds to which concept-page primary idea.