Code Katas
Five katas. Each targets one canonical systems-programming skill. For every kata:
- Read the specification.
- Predict the syscall sequence (which calls, in which order, with which arguments) before you open an editor.
- Implement it. Use
-Wall -Wextra -pthread(where relevant) and-g -O0for the first build. - Verify the behavior with the listed tool (
strace,ps,valgrind,gdb, or a direct comparison against the system utility). - Repeat from scratch until you can produce a working version in the time limit without reference.
Kata 1: Mini Shell with One Pipe
Time limit: 60 minutes
Targets: fork, execvp, pipe, dup2, waitpid.
Write a shell msh that:
- reads a line from stdin
- splits it on whitespace
- handles exactly one
|(e.g.,ls -l | wc -l) - runs each side in a child with stdin/stdout rewired through a single pipe
- closes both pipe ends in the parent before waiting
- loops until EOF
Verification: echo hello | ./msh with input cat | wc -c should print 6. Run ./msh under strace -f and confirm you see exactly one pipe2 (or pipe), two forks, two execves, four closes in the parent path, and two wait4s.
Stretch: support < file and > file redirection.
Kata 2: cat and wc From Raw Syscalls
Time limit: 45 minutes combined
Targets: open, read, write, close, EINTR handling, byte loops.
Write mycat and mywc:
mycat [FILE...]-- behave likecat; if no args, read stdin. Byte-exact output against systemcat.mywc [FILE...]-- printlines words bytes FILElikewc, and a total line when given multiple files.
Both must: loop read and write, handle EINTR, use a 4 KB buffer, and close every opened fd.
Verification: compare output with cat and wc on /etc/hostname, /etc/services, and any 1 MB text file (diff must be empty).
Stretch: make mycat support -n (line numbers) using only raw I/O.
Kata 3: Producer-Consumer with Condition Variables
Time limit: 45 minutes
Targets: pthread, pthread_mutex, pthread_cond, bounded buffer.
Write a program where:
- 2 producer threads each push integers 1..100000 into a shared bounded queue (capacity 16)
- 4 consumer threads pop and accumulate into a single shared total (protected by the same mutex, or an atomic counter)
- at shutdown, each producer pushes a sentinel per consumer
- the main thread joins all 6 threads and prints the final total
- the final total equals
2 * 100000 * (100000 + 1) / 2 = 10000100000
Annotate your source with a comment on every synchronization call stating which race it prevents.
Verification: run 50 times with ./pc | sort -u | wc -l -- should always be exactly 1 (one unique correct answer).
Kata 4: Tiny HTTP Server
Time limit: 60 minutes
Targets: socket, bind, listen, accept, recv, send, HTTP framing, short-write loops.
Write tinyhttp that:
- listens on port 8080 with
SO_REUSEADDR - accepts one connection at a time (single-threaded is fine)
- reads the HTTP request headers until it sees
\r\n\r\n - ignores the method and path and responds with:
HTTP/1.1 200 OK\r\n
Content-Type: text/plain\r\n
Content-Length: 14\r\n
\r\n
hello, world\n
Verification: curl -v http://localhost:8080/ returns 200 OK and hello, world. curl repeated many times never leaks file descriptors (ls /proc/$(pidof tinyhttp)/fd | wc -l is stable).
Stretch: add a fork-per-connection model with a SIGCHLD reaper. Serve two curls in parallel.
Kata 5: gdb Hunt on a Planted Bug
Time limit: 30 minutes
Targets: gdb breakpoints, watchpoints, core dumps.
Build and debug this program (name it bughunt.c):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct { int id; char name[16]; int score; } User;
static User *users;
static int count;
static void load(const char *names[], int n) {
users = calloc(n, sizeof *users);
count = n;
for (int i = 0; i < n; i++) {
users[i].id = i;
strcpy(users[i].name, names[i]); /* bug: no length check */
users[i].score = 0;
}
}
static void bump(int id, int delta) { users[id].score += delta; }
int main(void) {
const char *names[] = { "alice", "bob", "this-name-is-definitely-way-too-long", "dave" };
load(names, 4);
for (int i = 0; i < 4; i++) bump(i, 10);
printf("%s has score %d\n", users[2].name, users[2].score);
return 0;
}
Tasks:
- Build with
-g -O0, run, observe either a crash or wrongscorefor a later user. - Enable core dumps and open the core in
gdb. Produce a backtrace. - Set a watchpoint on
users[3].scoreat the start ofmainand run until it changes. Record the instruction address and the caller. - Explain which write corrupted what, in one sentence.
- Fix
loadby usingstrncpywithsizeof users[i].name - 1and an explicit null terminator. Rebuild. Confirm the watchpoint no longer triggers unexpectedly.
Repeat until: you can enter gdb, set a breakpoint, run, print locals, and produce a backtrace without looking anything up.
How to Score Yourself
- Kata 1: complete if
./mshrunsls | wc -landcat | wc -ccorrectly andstrace -fshows no leaked fds after a pipeline finishes. - Kata 2: complete if
diffagainst systemcat/wcis empty on three distinct inputs. - Kata 3: complete if the program produces the same correct total across 50 runs with no
helgrindwarnings (valgrind --tool=helgrind ./pc). - Kata 4: complete if
curlreports 200 OK andtinyhttpsurvives 1,000 sequential requests without leaking fds. - Kata 5: complete if you can reproduce, locate, and fix the overflow in under 30 minutes without asking anyone.