Code Katas
Focused, repeatable drills for this module. Complete each one multiple times until automatic.
Format for every kata:
- Time limit: as stated
- Setup: noted per kata
- Evidence: what you must show at the end
Kata 1: Virtual Address Bit Split
Time limit: 10 minutes
Goal: Decompose virtual addresses into VPN + offset for several page sizes.
Setup: Paper and pencil only.
For each of these (virtual_address, page_size) pairs, compute the offset and VPN:
0x0000_7FFE_ABCD_1234, page size 4 KiB.- Same address, page size 2 MiB.
- Same address, page size 1 GiB.
0x0000_0000_0040_10C8, page size 4 KiB.0x0000_FFFF_FFFF_FFFF, page size 4 KiB.
Repeat until: you can do all five in under 5 minutes without errors.
Kata 2: Four-Level Page-Table Walk
Time limit: 15 minutes
Goal: Simulate an x86-64 4-level walk on paper.
Setup: A made-up (PML4, PDPT, PD, PT) with a few entries you pick.
Write:
- the five index fields (PML4 = 9 bits, PDPT = 9, PD = 9, PT = 9, offset = 12)
- the lookup steps, naming the physical address at each level
- the final physical address
Then repeat the walk with the second-level entry's "page size" bit set (2 MiB huge page). Stop at the right level.
Repeat until: you can walk a 4-level translation in under 5 minutes and correctly short-circuit huge pages without hesitation.
Kata 3: Minor vs Major Fault Trace
Time limit: 15 minutes
Goal: Predict and then observe fault counts.
Setup: A Linux machine with perf.
Write a 20-line C program that:
mmaps 256 MiB anonymous private.- Writes one byte per 4 KiB page (with a
msyncor barrier at the end). - Calls
msync/madvise(..., MADV_DONTNEED)on the region. - Writes one byte per page again.
Predict minor and major fault counts before running, then measure with perf stat -e minor-faults,major-faults.
Repeat until: your prediction is within 10% of measured.
Kata 4: Reference-String Simulator by Hand
Time limit: 15 minutes
Goal: Simulate replacement policies on paper.
Setup: One reference string and 3 frames.
Given 1 2 3 4 1 2 5 1 2 3 4 5:
- Trace FIFO, LRU, and OPT with 3 frames. Count faults.
- Redo with 4 frames. Note where FIFO shows anomalous behavior.
- Redo with clock (reference bits start at 0, set on access; evictions advance the hand).
Repeat until: you can produce correct fault counts for all policies on a new reference string within 5 minutes.
Kata 5: /proc/$pid/maps Reading
Time limit: 10 minutes
Goal: Read a real maps file cold and explain every region.
Setup: A running process; open /proc/$pid/maps.
For each mapping, identify:
- permissions (
r,w,x,p/s) - backing (file path,
[stack],[heap],[anon],[vvar],[vdso],[vsyscall]) - whether it is likely a libc region, a heap region, a stack region, an
mmaped file, etc.
Repeat until: you can classify at least 80% of a typical process's 30+ mappings in under 5 minutes without looking up syntax.
Kata 6: mmap Four Flavors
Time limit: 20 minutes
Goal: Write code and predict behavior for each of the four mmap flavors.
Setup: A C compiler and a small test file.
Write four tiny programs:
- Anonymous private 1 MiB, write a byte in every page, observe RSS.
- File-backed private read-only over a 1 MiB file, read every byte, observe RSS and
Shared_CleanvsPrivate_Cleanin/proc/$pid/smaps. - File-backed shared writable over a 1 MiB file, write a pattern, and verify with
catthat the file contents changed. - Anonymous shared via
memfd_create+mmap, fork, and confirm the parent's write is visible to the child.
Repeat until: you can write each program from scratch in under 3 minutes and explain /proc/$pid/smaps classification for each.
Kata 7: TLB Stride Benchmark (micro)
Time limit: 20 minutes
Goal: Write a very small program that exhibits a TLB-miss knee.
Setup: A Linux machine with perf.
Write a program that walks an N-entry uint64_t array at stride S, with N large enough to exceed L2 cache and L2 TLB reach. Time at least three stride values: within a page (64 bytes), one per page (4096 bytes), one per 2 MiB (2 << 20 bytes).
Report time-per-access and dTLB-load-misses per access for each stride.
Repeat until: you can see the TLB knee cleanly and explain in writing what changes between strides.
Kata 8: Paradigm Triage Timed Sprint
Time limit: 30 minutes total
Goal: Diagnose a symptom and name the Cluster it lives in.
Setup: Any eight symptoms from exercises.md or made-up from production stories.
For each symptom:
- tag it as TLB, page-fault, thrashing, allocator, fragmentation, CoW,
mmap, NUMA, or "other" - name the one measurement you would run first to confirm
- name the one-line fix you would try if confirmed
Target: 8 triages in 30 minutes.
Completion Standard
- All eight katas completed at least once.
- Katas 1, 2, 4, 6 completed three or more times, within time limit, from scratch.
- Programs from Katas 3, 6, 7 run cleanly on your machine with
perfmeasurements matching predictions. - You can classify a process's
/proc/$pid/mapsin under 5 minutes. - You can walk a 4-level translation in under 5 minutes from a blank piece of paper.