Code Katas: FUSE, fsync, epoll, Recovery

These are longer than the 15-minute katas from prior modules because the work is integrative: you are wiring up multiple concepts. Target 60-120 minutes per kata. Work one per day, not all at once.

Kata 1: Build a Small FUSE Filesystem

Skill targeted: inode abstraction, directory operations, syscall wiring.
Time limit: 2-4 hours.

Using libfuse (C) or python-fuse / pyfuse3 / Rust's fuser, implement a filesystem that:

Mounts at /tmp/mymount.
Stores data in memory (a dict[path] -> bytes is fine).
Implements: getattr, readdir, lookup, create, open, read, write, unlink, mkdir, rmdir.
Returns sensible stat records (sizes, timestamps, permissions).
Handles echo hello > /tmp/mymount/a and cat /tmp/mymount/a round-trip.

Bonus goals:

Support hard links (link) and verify that ln then unlink decrements a link count correctly.
Back it with a file on disk instead of a dict; now you have a real (tiny) FS.
Run ls, cp, cat, rm against your mount; handle the corner cases they surface.

Success criterion: you can bash -c "cd /tmp/mymount && touch a && ln a b && ls -li" and the output is correct.

Kata 2: Measure `fsync` Cost

Skill targeted: fsync semantics, write-back, batching.
Time limit: 1-2 hours.

Write a program that produces a table:

Mode	Records/sec	Amortized `fsync`s per record
no fsync	?	0
fsync per record	?	1
fsync per 10 records	?	0.1
fsync per 100 records	?	0.01
O_SYNC	?	~1
fdatasync per record	?	1

Record size: 4 KiB. Run at least 10,000 total records per mode. Run on at least one real device (not just tmpfs).

Then stress it:

After each mode, send SIGKILL to a fresh run at a random time.
Reopen the file and measure its length.
Quantify how much data is lost per mode.

Success criterion: a one-page write-up with your numbers and one sentence per row explaining why the number is what it is.

Kata 3: Write a Small Event Loop With `epoll`

Skill targeted: non-blocking I/O, readiness-based multiplexing.
Time limit: 2-3 hours.

Implement a TCP echo server in C (or Python with selectors.EpollSelector, or Rust with epoll directly) that:

Uses a single thread.
Uses epoll_create1, epoll_ctl, and epoll_wait.
Handles at least 1,000 concurrent connections.
Correctly handles: partial reads, partial writes (buffer and register EPOLLOUT), peer close (EPOLLRDHUP), and errors.
Echoes input back to the sender in-order.

Benchmark with nc, ab, or a custom driver that opens 1,000 connections and sends chunks.

Bonus goals:

Add edge-triggered mode (EPOLLET); verify you drain sockets correctly in a loop.
Rewrite with io_uring via liburing and compare CPU usage under 10k concurrent idle connections.

Success criterion: htop shows a single thread, ss -s shows your connection count, and echoed data is bit-for-bit correct under 1,000 parallel clients.

Kata 4: Demonstrate Crash Recovery With and Without Journaling

Skill targeted: crash consistency reasoning end-to-end.
Time limit: 2-3 hours.

This kata is simulated because real power-loss testing is destructive. Build a small in-memory "file system" model in Python (or any language) that:

Has a superblock, inode bitmap, inode table, data bitmap, and data blocks (each just a list).
Implements touch, append, unlink as sequences of explicit writes to these structures.
Has a log region for the "journaled" mode.
Can run an operation in two modes: naive (direct writes) and journaled (write to log, commit, then apply).

Then:

Pick an operation, e.g. append.
Run it step-by-step but allow the test to inject a "crash" (abort execution) between any two steps.
After crash, run recover(fs) which:
- in naive mode, runs a consistency check that detects orphan blocks, dangling pointers, wrong link counts;
- in journaled mode, reads the log and replays committed transactions.
Compare the final state: which modes recover correctly from which crash points?

Produce a table of (operation, crash point, mode, post-recovery correctness). It should match the analysis from Practice 2: Crash Consistency Clinic.

Bonus goals:

Add COW mode (write new blocks, flip root pointer atomically at the end).
Simulate drive cache reordering by shuffling pending writes before each crash injection.

Success criterion: your table demonstrates all three properties in your own output:

naive mode can corrupt on at least one scenario
journaled mode always recovers a consistent state
COW mode always presents either pre-operation or post-operation state

Completion Checklist

Kata 1: FUSE filesystem mounted, ls/cat/ln/rm round-trip
Kata 2: fsync table produced with six modes and a post-SIGKILL loss measurement
Kata 3: single-threaded epoll server handling 1k+ connections
Kata 4: crash-consistency simulator with naive vs journaled modes
Wrote up at least one surprising result or wrong hypothesis from the measurements

Kata 1: Build a Small FUSE Filesystem​

Kata 2: Measure fsync Cost​

Kata 3: Write a Small Event Loop With epoll​

Kata 4: Demonstrate Crash Recovery With and Without Journaling​

Completion Checklist​

Kata 1: Build a Small FUSE Filesystem

Kata 2: Measure `fsync` Cost

Kata 3: Write a Small Event Loop With `epoll`

Kata 4: Demonstrate Crash Recovery With and Without Journaling

Completion Checklist