Module 4: File Systems & I/O: Case Studies

These case studies ground persistence and I/O in concrete failures: file descriptors, buffering, fsync, journaling, page cache, readiness, and event loops.

Case Study 1: `write` Returned But Data Was Lost

Scenario: An app writes a critical config file, and the machine crashes or loses power shortly after `write` returns. After reboot, the file is empty or old.

Source anchor: fsync(2) documents flushing modified in-core file data to stable storage.

Module concepts: write buffer, page cache, durability, fsync, rename.

Wrong Approach

"write returning means data is on disk."

Better Approach

Use an atomic file update pattern:

write temp file (same directory as target)
fsync temp file
rename temp -> target
fsync containing directory
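
The four steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name `atomic_write` and the `.tmp-` prefix are hypothetical, and the directory fsync assumes a POSIX system where a directory can be opened read-only and passed to `os.fsync`.

```python
import os
import tempfile


def atomic_write(target: str, data: bytes) -> None:
    """Replace `target` with `data` so a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(target))
    # Step 1: write a temp file in the same directory, so the later rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # user-space buffer -> kernel page cache
            os.fsync(tmp.fileno())   # Step 2: page cache -> stable storage
        os.replace(tmp_path, target)  # Step 3: atomic rename over the target
    except BaseException:
        # A failure before the rename leaves the old target untouched;
        # remove the temp file so it does not accumulate.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Step 4: fsync the containing directory so the rename itself is durable.
    # Assumption: POSIX semantics; this step does not apply on Windows.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A caller would use it as `atomic_write("config.json", b"...")`. Note that `tempfile.mkstemp` creates the file with owner-only permissions, so a real config writer may need to restore the intended mode before the rename.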

Tradeoff Table

Choice	Gain	Cost
write in place	simple code	corruption risk on crash
temp file + fsync + rename	strong durability	more I/O
defer sync entirely	low latency	data loss window

Failure Mode

The application updates page-cache state but crashes before data and metadata reach durable storage, leaving an empty, partial, or stale file after reboot.

Project / Capstone Connection

Use this for config writers, grade exports, or any capstone workflow that rewrites important files and must survive power loss cleanly.

Required Artifact

Draw the crash points and state what exists after reboot.

Case Study 2: `select` Falls Over With Many Sockets

Scenario: A server watches 50,000 connections with select. CPU rises even when little traffic arrives.

Source anchor: epoll(7) explains scalable readiness notification through interest and ready lists.

Module concepts: select, poll, epoll, readiness, event loop.

Wrong Approach

"All readiness APIs scale the same."

Better Approach

Use an event-loop model:

interest list:
  file descriptors to watch

ready list:
  descriptors ready for I/O

loop:
  wait, drain, update interest
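
The interest list, ready list, and loop body above can be sketched with Python's standard `selectors` module, which picks epoll on Linux and another mechanism elsewhere. This is an illustrative sketch only: `run_once` and `demo` are hypothetical names, and local socket pairs stand in for real client connections.

```python
import selectors
import socket


def run_once(sel: selectors.BaseSelector, timeout: float = 1.0) -> list[bytes]:
    """Wait once, then drain every descriptor reported as ready."""
    received = []
    for key, _events in sel.select(timeout):   # ready list
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            received.append(data)
        else:
            # Peer closed: update the interest list and release the fd.
            sel.unregister(conn)
            conn.close()
    return received


def demo() -> list[bytes]:
    sel = selectors.DefaultSelector()   # epoll on Linux, kqueue on BSD/macOS
    pairs = [socket.socketpair() for _ in range(50)]
    for _writer, reader in pairs:
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ)   # interest list
    pairs[7][0].sendall(b"hello")   # only one of the 50 sockets has data
    try:
        return run_once(sel)
    finally:
        sel.close()
        for writer, reader in pairs:
            writer.close()
            reader.close()
```

Calling `demo()` returns only the payload from the one active socket; the loop body never touches the 49 idle ones, which is the property that matters when most connections are quiet.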

Tradeoff Table

Choice	Gain	Cost
`select`	widely known	poor scaling at high fd counts; descriptors at or above FD_SETSIZE (typically 1024) are not supported
`poll`	simpler fd-set handling	still scans all fds
`epoll`	efficient for many idle sockets	Linux-specific complexity

Failure Mode

The server repeatedly scans large descriptor sets even when little is ready, so CPU climbs with connection count rather than useful work.

Project / Capstone Connection

This fits chat servers, websocket backends, or proxy capstones that need to hold many mostly-idle connections efficiently.

Required Artifact

Compare select, poll, and epoll for 50,000 mostly-idle sockets.

Case Study 3: Page Cache Makes Benchmark Lie

Scenario: A file-read benchmark is extremely fast on the second run. The learner concludes the disk is fast.

Source anchor: Linux page cache documentation (where available) and the module readings explain why repeated reads of the same file are served from memory.

Module concepts: page cache, cold cache, warm cache, benchmarking.

Wrong Approach

Benchmark only warm-cache reads.

Better Approach

State cache condition:

cold run:
  includes storage I/O

warm run:
  measures memory/page-cache path

production:
  estimate cache hit ratio
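
One way to make the cache condition explicit is to label each run in the benchmark output. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the eviction step relies on `os.posix_fadvise` with `POSIX_FADV_DONTNEED`, which is only a hint, is not available on every platform, and does not guarantee a truly cold cache.

```python
import os
import time


def timed_read(path: str, chunk: int = 1 << 20) -> tuple[int, float]:
    """Read the whole file unbuffered; return (bytes read, seconds elapsed)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=0) as f:
        while block := f.read(chunk):
            total += len(block)
    return total, time.perf_counter() - start


def evict_from_page_cache(path: str) -> bool:
    """Best-effort hint to drop this file's cached pages; False if unsupported."""
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)   # dirty pages cannot be dropped, so flush them first
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    except OSError:
        return False
    finally:
        os.close(fd)
    return True


def benchmark(path: str) -> dict:
    evicted = evict_from_page_cache(path)
    size, first = timed_read(path)   # cold only if the hint took effect
    _, second = timed_read(path)     # most likely served from the page cache
    return {
        "bytes": size,
        "first_run_s": first,
        "second_run_s": second,
        "first_run_label": "cold (hinted)" if evicted else "unknown cache state",
    }
```

The report deliberately carries a label for the first run instead of assuming it was cold, so a reader can tell which number, if any, reflects storage I/O.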

Tradeoff Table

Choice	Gain	Cost
warm-cache-only benchmark	easy repeatability	misleading storage claims
cold and warm runs	fuller picture	harder setup
production trace correlation	realistic interpretation	more measurement work

Failure Mode

The second run measures memory-resident page-cache behavior, but the learner reports it as disk throughput and reaches the wrong system conclusion.

Project / Capstone Connection

Use this when presenting benchmark results for backup tools, media pipelines, or data-ingest capstones that depend on storage behavior.

Required Artifact

Write a benchmark report with cold/warm runs, cache condition, and interpretation.

Case Study 4: File Descriptor Leak

Scenario: A server opens files and sockets but forgets to close them on some error paths. Eventually EMFILE appears.

Source anchor: The Linux open(2) and close(2) man pages describe file descriptors and their lifecycle.

Module concepts: file descriptor, open-file table, resource leak, limits.

Wrong Approach

"Memory is the only leak that matters."

Better Approach

Track fd ownership:

open point:
  who owns fd?

transfer:
  does ownership move?

close:
  all success/error paths
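
The checklist above can be enforced in code by giving each descriptor a single owner whose scope closes it on every exit path. The sketch below is one possible shape, with hypothetical names `owned_fd` and `read_header`; it uses raw descriptors from `os.open` to keep the ownership visible.

```python
import os
from contextlib import contextmanager


@contextmanager
def owned_fd(path: str, flags: int = os.O_RDONLY):
    """Open `path`; the `with` block owns the fd and closes it on every exit."""
    fd = os.open(path, flags)   # open point: this context manager is the owner
    try:
        yield fd                # no transfer: the caller only borrows the fd
    finally:
        os.close(fd)            # close: runs on success and on exceptions


def read_header(path: str, size: int = 16) -> bytes:
    with owned_fd(path) as fd:
        header = os.read(fd, size)
        if len(header) < size:
            # Error path: the fd is still closed before this propagates.
            raise ValueError("file shorter than expected header")
        return header
```

If ownership does need to move, for example handing an accepted socket to a worker, the handoff point should be explicit so that exactly one side is responsible for the close.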

Tradeoff Table

Choice	Gain	Cost
implicit fd ownership	quick coding	leak-prone error paths
explicit owner per fd	clearer cleanup	more discipline
RAII/helper wrapper	safer lifecycle	abstraction overhead

Failure Mode

Open descriptors survive exceptional paths and retries until the process hits the per-process fd limit and new opens fail with EMFILE.

Project / Capstone Connection

This belongs in servers, crawlers, or pipeline capstones that open many files and sockets under mixed success and failure paths.

Required Artifact

Write an fd ownership checklist and leak reproduction.

Case Study 5: Synchronous Logging Blocks Request Path

Scenario: Every request writes and flushes a log line synchronously. Tail latency follows disk latency.

Source anchor: fsync(2) and I/O readiness docs show why persistence and request latency are coupled when flushed inline.

Module concepts: synchronous I/O, buffering, durability, latency.

Wrong Approach

Flush every log line on the request thread.

Better Approach

Separate durability class:

audit/security event:
  durable path required

debug/request log:
  buffered async path acceptable
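
The two durability classes above might be split as in the following sketch. All names here (`SplitLogger`, `max_pending`, `dropped`) are hypothetical, and the drop-when-full policy is just one possible backpressure choice, not a recommendation from the module.

```python
import os
import queue
import threading


class SplitLogger:
    """Audit events are fsynced inline; debug lines go through a bounded queue."""

    def __init__(self, audit_path: str, debug_path: str, max_pending: int = 1000):
        self._audit = open(audit_path, "ab")
        self._debug = open(debug_path, "ab")
        self._pending: queue.Queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0   # approximate counter; not synchronized across threads
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def audit(self, line: str) -> None:
        # Durable path: the caller pays storage latency before continuing.
        self._audit.write(line.encode() + b"\n")
        self._audit.flush()
        os.fsync(self._audit.fileno())

    def debug(self, line: str) -> None:
        # Async path: never block the request thread; drop when the queue is full.
        try:
            self._pending.put_nowait(line)
        except queue.Full:
            self.dropped += 1

    def _drain(self) -> None:
        # Background thread: write queued lines until the None sentinel arrives.
        while (line := self._pending.get()) is not None:
            self._debug.write(line.encode() + b"\n")
        self._debug.flush()

    def close(self) -> None:
        self._pending.put(None)   # sentinel tells the worker to stop
        self._worker.join()
        self._audit.close()
        self._debug.close()
```

Here `audit()` returns only after the fsync completes, while `debug()` returns immediately; debug lines still in the queue or in the file buffer at crash time are the bounded loss the tradeoff table refers to.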

Tradeoff Table

Choice	Gain	Cost
synchronous flush per request	strongest per-line durability	high latency
buffered async logging	fast request path	bounded log loss
split audit vs debug channels	aligned durability	added routing complexity

Failure Mode

Request latency inherits storage latency because the request thread blocks on every flush instead of handing off noncritical logs.

Project / Capstone Connection

Apply this when deciding how application, audit, and debug logs should flow through capstone services with different loss-tolerance requirements.

Required Artifact

Create a logging durability matrix: event type, loss tolerance, flush policy, backpressure behavior.

Source Map

Source	Use it for
fsync(2)	durability and flushing
epoll(7)	scalable readiness notification
open(2) and close(2)	file descriptor lifecycle

Completion Standard

At least three artifacts are completed.
At least one artifact includes crash points.
At least one artifact compares readiness APIs.
