Learning Resources
This module is populated from the local chunked books in library/raw/semester-05-os-networking/books. Use this page as a source map, not as an instruction to read everything.
Source Stack
| Book | Role | How to use it in this module |
|---|---|---|
| Operating Systems: Three Easy Pieces (OSTEP), persistence chapters | Primary teaching source | Default escalation for every FS concept; the clearest operational explanations of inodes, layout, journaling, LFS, and disks |
| Operating System Concepts (Silberschatz) | Selective support | Use for the I/O subsystem (drivers, DMA), VFS framing, and consistency-checking perspective |
| Unix Network Programming (Stevens) | Selective support | Use for select, poll, and advanced polling; also the canonical I/O models framing |
| kernel.org documentation | Current Linux truth | Use for ext4, block layer, io_uring, and anything the textbooks predate |
| LWN.net | Evolution and nuance | Use for fsync failure modes, COW vs fsync interactions, and io_uring history |
Resource Map by Cluster
Cluster 1: The File Abstraction
| Need | Best local chunk | Why |
|---|---|---|
| what a file and directory are | OSTEP: Files and directories | Operational introduction to the abstraction |
| reading and writing files | OSTEP: Reading and writing files | How syscalls touch the inode |
| non-sequential I/O | OSTEP: Reading and writing but not sequentially | lseek and offsets |
| directory structure | OSTEP: Directory organization | Directory as a special kind of file |
| making directories | OSTEP: Making directories | mkdir trace |
| hard links | OSTEP: Hard links | Link count semantics |
| symbolic links | OSTEP: Symbolic links | Path-based indirection |
| the inode data structure | OSTEP: Aside - the inode data structure | Layout and pointer scheme |
| access paths | OSTEP: Access paths - reading and writing | End-to-end trace through kernel tables |
| VFS framing | OS Concepts: Virtual file systems | How Linux unifies many FS types |
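
The hard-link and symbolic-link rows above are easier to retain after watching the link count change. The following is a minimal sketch, assuming a POSIX system and a temporary directory on a filesystem that supports hard links; `link_demo` and the file names are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def link_demo():
    """Create a file, a hard link, and a symlink, then remove the original name."""
    with tempfile.TemporaryDirectory() as d:
        orig = os.path.join(d, "file")
        hard = os.path.join(d, "hard")
        soft = os.path.join(d, "soft")

        with open(orig, "w") as f:
            f.write("hello")

        os.link(orig, hard)     # hard link: a second name for the same inode
        os.symlink(orig, soft)  # symlink: a separate file that stores a path

        same_inode = os.stat(orig).st_ino == os.stat(hard).st_ino
        nlink_with_two_names = os.stat(orig).st_nlink

        os.unlink(orig)         # removes one name, not the inode
        nlink_with_one_name = os.stat(hard).st_nlink

        # os.path.exists follows symlinks, so a dangling link reports False.
        dangling = os.path.lexists(soft) and not os.path.exists(soft)

        with open(hard) as f:
            data = f.read()

        return same_inode, nlink_with_two_names, nlink_with_one_name, dangling, data
```

The data stays reachable through the hard link after the original name is unlinked, while the symbolic link is left pointing at a path that no longer exists.
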
Cluster 2: On-Disk Layout and Structure
| Need | Best local chunk | Why |
|---|---|---|
| disk interface | OSTEP: The interface | The block device contract |
| disk model | OSTEP: A simple disk drive | Platters, sectors, LBAs |
| I/O timing math | OSTEP: I/O time - doing the math | Quantitative model |
| disk scheduling | OSTEP: Disk scheduling | SSTF, C-SCAN, and the elevator |
| FS design philosophy | OSTEP: The way to think | Framing for layout decisions |
| overall layout | OSTEP: Overall organization | Canonical block arrangement |
| FFS - why block groups | OSTEP: The problem - poor performance | Motivation for locality |
| cylinder groups | OSTEP: Organizing structure - the cylinder group | Grouping related data |
| allocation policy | OSTEP: Policies - how to allocate files and directories | Where new data goes |
| locality measurement | OSTEP: Measuring file locality | Real measurements |
| large files | OSTEP: The large file exception | Why large files break naive locality |
| LFS design | OSTEP: Writing to disk sequentially | The sequential-only idea |
| LFS buffering | OSTEP: Writing sequentially and effectively | Segment buffering |
| LFS indirection | OSTEP: Solution through indirection - the inode map | Finding latest inodes |
| LFS garbage collection | OSTEP: A new problem - garbage collection | Cost of the log-structured design |
| ext4 format reference | ext4 documentation (kernel.org) | Authoritative on-disk layout |
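
The "I/O timing math" row refers to the simple model in which one request costs seek time plus rotational delay plus transfer time. A rough sketch of that arithmetic follows; the function names and the drive parameters in the usage note are assumptions for illustration, not figures taken from the chunk.

```python
def io_time_ms(seek_ms, rpm, transfer_mb_s, request_bytes):
    """Time for one request: T_io = T_seek + T_rotation + T_transfer (milliseconds)."""
    rotation_ms = 0.5 * 60_000 / rpm  # on average, wait half a rotation
    transfer_ms = request_bytes / (transfer_mb_s * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms


def io_rate_mb_s(seek_ms, rpm, transfer_mb_s, request_bytes):
    """Effective rate: R_io = request size / T_io (MB/s, with 1 MB = 10^6 bytes)."""
    seconds = io_time_ms(seek_ms, rpm, transfer_mb_s, request_bytes) / 1000
    return request_bytes / 1_000_000 / seconds
```

For a hypothetical drive with a 4 ms average seek, 15000 RPM, and 125 MB/s peak transfer, comparing `io_rate_mb_s(4, 15000, 125, 4096)` with `io_rate_mb_s(4, 15000, 125, 100_000_000)` shows the gap between small random requests and one large sequential transfer.
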
Cluster 3: Crash Consistency
| Need | Best local chunk | Why |
|---|---|---|
| worked crash example | OSTEP: A detailed example | The canonical four-case analysis |
| fsck | OSTEP: Solution 1 - the file system checker | Pre-journal recovery |
| data journaling | OSTEP: Data journaling | Commit ordering and transaction model |
| forcing writes | OSTEP: Aside - forcing writes to disk | Barriers and FLUSH |
| log optimization | OSTEP: Aside - optimizing log writes | Reducing journal cost |
| log optimization (part 2) | OSTEP: Optimizing log writes - part 2 | Continued |
| other approaches | OSTEP: Solution 3 - other approaches | Soft updates, optimistic crash consistency |
| COW framing | OSTEP: Copy-on-write mappings | COW pattern in general |
| ZFS | OSTEP: ZFS | COW FS in production |
| journaling FS appendix | OSTEP: Journaling file system | Compact summary |
| consistency checking | OS Concepts: Consistency checking | fsck-style perspective |
| disk failure modes | OSTEP: Disk failure modes | Beyond power loss |
| checksums | OSTEP: Detecting corruption - the checksum | Silent corruption |
| misdirected writes | OSTEP: Misdirected writes | Why just a checksum isn't enough |
Cluster 4: Caching and Performance
| Need | Best local chunk | Why |
|---|---|---|
| caching and buffering | OSTEP: Caching and buffering | Page and buffer cache |
| cache management | OSTEP: Cache management (paging) | Replacement policies that carry over |
| forcing writes | OSTEP: Forcing writes to disk | fsync barrier discussion |
| sequentiality | OSTEP: I/O time - doing the math | Quantitative gap |
| measuring locality | OSTEP: Measuring file locality | Real workload data |
| Linux performance | Brendan Gregg: Linux performance tools | Current practical reference |
| fsync failures | LWN: PostgreSQL's fsync() surprise | Real-world gotcha |
| fsync error handling | LWN: Improved fsync error handling | Evolution of the semantics |
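
The fsync rows above concern when buffered writes actually reach stable storage and what to do when the flush fails. A minimal sketch of a common durable-write pattern, assuming Linux or another POSIX system where a directory can be opened and passed to fsync; `durable_write` is a hypothetical helper name.

```python
import os


def durable_write(path, data):
    """Write bytes to path, then flush the file and its parent directory."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:                 # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                # raises OSError if the flush fails
    finally:
        os.close(fd)

    # Assumption: the file may be newly created, so the directory entry
    # also needs to be flushed for the name to survive a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The sketch lets an `OSError` from `os.fsync` propagate instead of retrying, on the cautious assumption that a failed flush may already have dropped the dirty data.
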
Cluster 5: I/O Models and the Syscall Path
| Need | Best local chunk | Why |
|---|---|---|
| I/O models framing | UNP: I/O models | Canonical five-model taxonomy |
| select | UNP: select function (part 1) | Classical multiplexing |
| select continued | UNP: select (part 2) | Usage details |
| select worked | UNP: select (part 3) | Echo server example |
| pselect | UNP: pselect function | Signal-safe variant |
| advanced polling | UNP: Advanced polling (part 1) | poll and extensions |
| advanced polling 2 | UNP: Advanced polling (part 2) | Continued |
| device architecture | OSTEP: System architecture | How devices attach |
| device protocol | OSTEP: The canonical protocol | Command/status/interrupt |
| DMA | OSTEP: More efficient data movement with DMA | Why the CPU is not the bottleneck |
| drivers | OSTEP: Fitting into the OS - the device driver | Where drivers sit |
| IDE case study | OSTEP: Case study - a simple IDE disk driver | Worked simple driver |
| epoll man page | epoll(7) man page | Authoritative semantics |
| io_uring primer | Lord of the io_uring (unixism.net) | Best hands-on intro |
| io_uring paper | Jens Axboe: Efficient IO with io_uring (PDF) | Canonical design document |
| io_uring evolution | LWN: rapid growth of io_uring | Historical arc |
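
The select rows above describe readiness-based multiplexing: ask the kernel which descriptors can be read without blocking, then read only those. A small sketch using Python's `select.select` on a local socket pair, assuming a Unix-like system; `wait_readable` and `readiness_demo` are hypothetical names.

```python
import select
import socket


def wait_readable(socks, timeout):
    """Return the subset of socks that select reports as readable."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


def readiness_demo():
    a, b = socket.socketpair()
    try:
        before = wait_readable([b], 0)     # poll: nothing has been sent yet
        a.sendall(b"ping")
        after = wait_readable([b], 1.0)    # b is now ready to read
        data = b.recv(4)                   # should not block after readiness
        return before, after == [b], data
    finally:
        a.close()
        b.close()
```

A timeout of 0 turns the call into a non-blocking poll, which is why the first check returns an empty list instead of waiting.
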
External Resources (Read-If-Curious)
Only use these if you want a second exposition or a supplementary problem source. The module is completable from the local chunks alone.
- kernel.org: Filesystems documentation - ext4, xfs, btrfs, f2fs specs.
- OpenZFS documentation - practical and design-level ZFS reference.
- Btrfs documentation - Btrfs on-disk format and operations.
- Brendan Gregg: Linux performance tools - diagnostic tool catalog for I/O and storage.
- Brendan Gregg: Disk / storage latency - latency anatomy and tracing.
- Dan Kegel: The C10K problem - historical context that motivated epoll.
- LWN: The page cache - deep dive into the Linux page cache.
- fio documentation - the benchmarking tool used in Practice 3.
- man7.org syscall index - definitive syscall reference.
Use Rules
- If you are stuck on any FS layout or crash scenario, go to OSTEP first. It is the best operational reference for persistence.
- If you need the I/O models taxonomy or socket-style readiness patterns, go to UNP.
- If you need current Linux specifics (ext4 format, io_uring, fsync corner cases), go to kernel.org and LWN.
- Open one chunk for one concept gap; do not wander through a whole appendix sequence.
- If rereading does not fix the problem, stop and draw the on-disk structures in your own hand before reading more.