I/O Performance Lab

Retrieval Prompts

State the sequential vs random read IOPS ratio you expect on: HDD, SATA SSD, NVMe Gen4.
Describe what the page cache is and how it relates to "free" memory.
State exactly what fsync guarantees and what close does not.
Describe read-ahead and name one workload where it hurts.
Describe write-back and name one failure it enables.
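
For the page-cache prompt above, it can help to look at the kernel's own counters before answering. The sketch below parses text in the /proc/meminfo format (Linux-specific) and reports MemFree, MemAvailable, and Cached in MiB. The helper names parse_meminfo and to_mib are hypothetical, and the sample text contains placeholder values, not measurements.

```python
def parse_meminfo(text: str) -> dict:
    """Parse '/proc/meminfo'-style lines ('Key:   12345 kB') into {key: kB}."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Key: value'
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def to_mib(fields: dict, keys=("MemFree", "MemAvailable", "Cached")) -> dict:
    """Convert the selected kB counters to MiB, skipping any that are absent."""
    return {k: fields[k] / 1024 for k in keys if k in fields}


# Placeholder sample, NOT a measurement. On Linux, read the real file instead:
#     with open("/proc/meminfo") as f: text = f.read()
SAMPLE = """MemTotal:       16384000 kB
MemFree:          512000 kB
MemAvailable:   12288000 kB
Cached:         10240000 kB
"""

for name, mib in to_mib(parse_meminfo(SAMPLE)).items():
    print(f"{name:>12}: {mib:8.0f} MiB")
```

Comparing MemFree with MemAvailable on your own machine, before and after reading a large file, is one way to see how cached pages are accounted for.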

Compare and Distinguish

Distinguish the items in each group:

throughput vs IOPS vs latency
sequential vs random, small-block vs large-block
page cache vs drive cache
fsync vs fdatasync vs sync_file_range
O_DIRECT vs buffered I/O

Common Mistake Check

Identify the error:

"My workload is fast because I fit in the page cache. So it will be fast in production."
"O_DIRECT is always faster for databases."
"Since NVMe has no seek penalty, random and sequential are equivalent."
"free shows little free memory, so I need to add RAM."
"I benchmarked with a warm cache, and the numbers match disk spec-sheet IOPS."
"I use sync_file_range, so my data is durable."
"Write-back means my writes are async and free."

Measurement Drills

Run each drill and record results. Explain each observation in one paragraph.

Drill 1: Cache effects

dd if=/dev/zero of=/tmp/big bs=1M count=1024
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
time cat /tmp/big > /dev/null     # cold
time cat /tmp/big > /dev/null     # warm

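Report cold and warm times. Compute effective MB/s for each. If /tmp is a tmpfs mount on your system, put the file on a disk-backed filesystem instead; tmpfs pages live in memory and drop_caches does not evict them, so there is no cold read to measure.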
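
The MB/s computation is just file size divided by elapsed time. A minimal sketch follows, assuming 1 MB = 10^6 bytes and the 1 GiB file produced by the dd command above. The helper name effective_mb_per_s is hypothetical, and the two timings are placeholders to be replaced with your own `real` values from time.

```python
def effective_mb_per_s(size_bytes: int, elapsed_s: float) -> float:
    """Return throughput in MB/s, with 1 MB = 10**6 bytes."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return size_bytes / elapsed_s / 1e6


# dd bs=1M count=1024 writes 1024 blocks of 1,048,576 bytes each.
FILE_SIZE = 1024 * 1024 * 1024

# Placeholder timings, NOT measurements: substitute your own results.
cold_s = 5.0
warm_s = 0.25

print(f"cold: {effective_mb_per_s(FILE_SIZE, cold_s):.0f} MB/s")
print(f"warm: {effective_mb_per_s(FILE_SIZE, warm_s):.0f} MB/s")
```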

Drill 2: fsync cost

Write a small program:

Open /tmp/log for appending, O_CREAT|O_WRONLY|O_APPEND.
Loop 10,000 times: write(fd, buf, 4096). No fsync. Record time.
Repeat with fsync after each write. Record time.
Repeat with fsync after every 100 writes. Record time.
Repeat with the file opened with O_SYNC and no explicit fsync (each write then carries the same durability as a write followed by fsync).

Report throughput (ops/sec) and amortized cost per op. Explain the gap.
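
One possible shape for the program, sketched in Python with the os module's thin wrappers over open, write, and fsync. The function name run_append_benchmark and its parameters are hypothetical; the flags, the loop count, and the 4096-byte buffer come from the steps above. Interpreter overhead is included in the timings, so treat the no-fsync number as a rough upper bound on cost rather than a precise syscall figure.

```python
import os
import time


def run_append_benchmark(path, n_writes=10_000, block_size=4096,
                         fsync_every=0, extra_flags=0):
    """Append n_writes blocks to path and return (ops_per_sec, seconds_per_op).

    fsync_every=0 means never call fsync; 1 means after every write;
    100 means after every 100th write. Pass extra_flags=os.O_SYNC to test
    synchronous opens instead of explicit fsync calls.
    """
    buf = b"x" * block_size
    flags = os.O_CREAT | os.O_WRONLY | os.O_APPEND | extra_flags
    fd = os.open(path, flags, 0o644)
    try:
        start = time.perf_counter()
        for i in range(1, n_writes + 1):
            # Short writes are possible in principle; this sketch ignores them.
            os.write(fd, buf)
            if fsync_every and i % fsync_every == 0:
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return n_writes / elapsed, elapsed / n_writes
```

Calling it on /tmp/log with fsync_every set to 0, 1, and 100, and once more with fsync_every=0 and extra_flags=os.O_SYNC, covers the four variants. Because the file is opened with O_APPEND, remove it between runs if you want each variant to start from an empty file.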

Drill 3: Sequential vs random

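Using fio, run the following jobs with a ~1 GiB test file each (fio lays the file out before a read job if it does not already exist). The random jobs set --ioengine=libaio (Linux) because fio's default synchronous engine does not benefit from --iodepth values above 1. Note that the sequential jobs are buffered (--direct=0) while the random jobs bypass the page cache (--direct=1).

fio --name=seq-r --rw=read --bs=1M --size=1G --direct=0
fio --name=seq-w --rw=write --bs=1M --size=1G --direct=0
fio --name=rand-r --rw=randread --bs=4k --size=1G --direct=1 --ioengine=libaio --iodepth=32
fio --name=rand-w --rw=randwrite --bs=4k --size=1G --direct=1 --ioengine=libaio --iodepth=32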

Tabulate IOPS, bandwidth, and average latency for each.
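
The three columns are not independent, which gives a quick sanity check on the table. Bandwidth equals IOPS times block size, and by Little's law the mean latency is roughly the queue depth divided by IOPS when the queue is kept full. A minimal sketch, with hypothetical helper names and a placeholder IOPS figure that is not a measurement:

```python
def expected_bandwidth_mib_s(iops: float, block_size_bytes: int) -> float:
    """Bandwidth implied by an IOPS figure at a fixed block size, in MiB/s."""
    return iops * block_size_bytes / (1024 * 1024)


def expected_mean_latency_ms(iops: float, iodepth: int) -> float:
    """Mean latency implied by Little's law (in-flight = rate x latency), in ms.

    Assumes the device queue really holds `iodepth` requests on average.
    """
    return iodepth / iops * 1000.0


# Placeholder value, NOT a measurement: substitute the IOPS fio reports.
reported_iops = 100_000
print(f"{expected_bandwidth_mib_s(reported_iops, 4096):.1f} MiB/s at 4 KiB blocks")
print(f"{expected_mean_latency_ms(reported_iops, 32):.2f} ms mean latency at iodepth 32")
```

If the bandwidth or latency fio reports differs substantially from these implied values, the job was probably not running at the block size or queue depth you intended.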

Drill 4: Read-ahead heuristic

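Create a 4 GiB file of zeros with dd.
Read it sequentially with O_DIRECT (bypasses the page cache, so kernel read-ahead does not apply).
Read it sequentially without O_DIRECT (read-ahead enabled).
Call posix_fadvise with POSIX_FADV_RANDOM on the file descriptor, read sequentially again, and compare.
Drop the page cache before each run so that every read starts cold.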

Explain what read-ahead gives you and when it misfires.
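
The buffered runs, with and without the POSIX_FADV_RANDOM hint, can be sketched in Python using os.posix_fadvise (available on Unix). The function name timed_sequential_read and the 1 MiB chunk size are assumptions for illustration. The O_DIRECT run is left out because it needs aligned buffers, which plain os.read does not provide.

```python
import os
import time


def timed_sequential_read(path, advice=None, chunk_size=1 << 20):
    """Read path front to back and return (bytes_read, elapsed_seconds).

    advice, if given, is passed to posix_fadvise for the whole file
    (offset 0, length 0 means 'to end of file') before reading starts.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if advice is not None:
            os.posix_fadvise(fd, 0, 0, advice)
        total = 0
        start = time.perf_counter()
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # empty bytes object signals end of file
                break
            total += len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed
```

Calling it once with advice=None and once with advice=os.POSIX_FADV_RANDOM, dropping the page cache before each call, gives the two timings to compare.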

Mini Application: Budget

You have an HDD with 100 random IOPS and 150 MiB/s sequential. You need to:

ingest 10,000 small records/sec, each ~512 B.
serve random read requests at 50 req/sec.

Design an on-disk layout that can handle the workload, and estimate the resulting disk utilization. State which techniques and data structures you would use (write-back batching, group-commit WAL, LSM tree, B-tree) and justify the choice with a numeric I/O budget.

Repeat the exercise for NVMe at 500 kIOPS random + 3 GiB/s sequential.
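
The arithmetic behind such a budget can be sketched as follows. This is one simplified model, not the expected answer. It assumes each read request costs exactly one random I/O (any index is held in memory), that batched ingest is written sequentially, and that write amplification from compaction or page splits is ignored. The function name io_budget and the result keys are hypothetical.

```python
def io_budget(records_per_s, record_bytes, reads_per_s,
              random_iops, seq_mib_s):
    """Return utilization fractions for a device under a simple model.

    Assumptions (all simplifications): one random I/O per read request,
    ingest appended sequentially in batches, no write amplification.
    """
    ingest_mib_s = records_per_s * record_bytes / 2**20
    return {
        # Fraction of sequential bandwidth used if ingest is batched.
        "batched_ingest_util": ingest_mib_s / seq_mib_s,
        # Fraction of random IOPS used by reads.
        "random_read_util": reads_per_s / random_iops,
        # Fraction of random IOPS needed if every record were its own write.
        "unbatched_ingest_util": records_per_s / random_iops,
    }


# Device figures and workload taken from the exercise text.
hdd = io_budget(10_000, 512, 50, random_iops=100, seq_mib_s=150)
nvme = io_budget(10_000, 512, 50, random_iops=500_000, seq_mib_s=3 * 1024)

for name, budget in (("HDD", hdd), ("NVMe", nvme)):
    for key, value in budget.items():
        print(f"{name:5} {key:22} {value:10.4%}")
```

A value above 100% in any row means that approach exceeds the device under this model, which is the kind of number the justification should cite.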

Evidence Check

This page is complete only if you can:

run the measurements above on your own system
explain surprises and non-surprises with reference to page cache, drive cache, and device characteristics
defend a storage design choice with an I/O budget
