Sequential vs Random I/O Performance
What This Concept Is
Sequential I/O reads or writes contiguous blocks; random I/O scatters across unrelated blocks. The performance gap between the two is the single largest free variable in storage design.
Order-of-magnitude numbers (2025-ish, consumer-grade):
Device          Sequential (MiB/s)   Random 4 KiB (IOPS)
---------------------------------------------------------
7200 RPM HDD    ~150                 ~100
SATA SSD        ~500                 ~80,000
NVMe (Gen3)     ~3,000               ~400,000
NVMe (Gen4)     ~7,000               ~800,000
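On HDD, the gap is between two and three orders of magnitude: ~150 MiB/s sequential versus ~0.4 MiB/s for 100 random 4 KiB reads per second (seek dominates). On NVMe, the peak figures above differ by only about 2x, but those IOPS assume a deep queue; at queue depth 1 the gap is typically still one to two orders of magnitude. Sequential access amortizes DMA setup, PCIe overhead, and FTL remapping.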
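To compare the two columns directly, the random IOPS figures can be converted to bandwidth and divided into the sequential figures. This is a minimal sketch that uses only the approximate numbers from the table above; the function names are illustrative.

```python
# Approximate figures copied from the table above:
# (sequential MiB/s, random 4 KiB IOPS).
DEVICES = {
    "7200 RPM HDD": (150, 100),
    "SATA SSD": (500, 80_000),
    "NVMe (Gen3)": (3_000, 400_000),
    "NVMe (Gen4)": (7_000, 800_000),
}

def random_mib_per_s(iops, block_kib=4):
    """Bandwidth delivered by `iops` requests of `block_kib` KiB each."""
    return iops * block_kib / 1024

def seq_to_random_ratio(seq_mib_s, iops, block_kib=4):
    """How many times faster sequential is than random, by throughput."""
    return seq_mib_s / random_mib_per_s(iops, block_kib)

for name, (seq, iops) in DEVICES.items():
    print(f"{name:14s} random = {random_mib_per_s(iops):8.1f} MiB/s, "
          f"gap = {seq_to_random_ratio(seq, iops):6.1f}x")
```

The ratios only describe these peak, rounded figures. Real devices at low queue depth usually show a wider gap.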
A second axis: block size. Larger requests amortize per-request overhead:
- 4 KiB sequential reads: IOPS-bound
- 128 KiB sequential reads: bandwidth-bound
- 4 KiB random reads: seek + FTL-bound
- 4 KiB random writes on SSD: garbage-collection-bound (worst)
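One way to see why larger requests help is to model each request as a fixed overhead plus a transfer time. The sketch below is a toy model with one request in flight at a time. The 20 microsecond overhead and the 3,000 MiB/s bandwidth are assumed values for illustration, not measurements.

```python
def effective_mib_per_s(block_kib, overhead_us, bandwidth_mib_s):
    """Throughput when every request costs a fixed overhead plus transfer time."""
    block_mib = block_kib / 1024
    transfer_s = block_mib / bandwidth_mib_s
    total_s = overhead_us / 1e6 + transfer_s
    return block_mib / total_s

# Assumed, illustrative parameters (not measured on any real device).
OVERHEAD_US = 20
BANDWIDTH_MIB_S = 3_000

for block_kib in (4, 16, 128, 1024):
    rate = effective_mib_per_s(block_kib, OVERHEAD_US, BANDWIDTH_MIB_S)
    print(f"{block_kib:5d} KiB requests: {rate:7.0f} MiB/s")
```

With these assumptions, 4 KiB requests spend most of their time on overhead, while 128 KiB and larger requests approach the bandwidth limit.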
Why It Matters Here
Every piece of FS design in this module is shaped by this gap:
- Block groups (FFS, ext4) place related inodes and data close together to approximate sequential access.
- LFS turns every write into a sequential log append.
- Write-back coalesces small random writes in the cache into fewer larger sequential writes at disk flush.
- Read-ahead trades a small speculative cost for large bandwidth on sequential patterns.
- Journaling places the journal in a sequential region so every commit is a sequential append, regardless of where the application's writes go.
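The write-back coalescing idea from the list above can be sketched as merging dirty block numbers into contiguous runs before flushing. This is an illustration of the merging step only; real kernel write-back and I/O schedulers are considerably more involved, and the function name and block numbers are hypothetical.

```python
def coalesce(dirty_blocks):
    """Merge dirty block numbers into sorted (start, length) runs."""
    runs = []
    for block in sorted(set(dirty_blocks)):
        if runs and block == runs[-1][0] + runs[-1][1]:
            # Block directly follows the last run: extend it by one.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # Gap before this block: start a new run.
            runs.append((block, 1))
    return runs

# Eight small writes arriving in scattered order (hypothetical block numbers).
dirty = [7, 3, 4, 100, 5, 6, 101, 3]
print(coalesce(dirty))  # [(3, 5), (100, 2)]
```

Eight scattered single-block writes become two larger requests, each of which the device can service sequentially.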
For applications:
- Databases choose B-trees (random updates in place) or LSM-trees (sequential appends + background compaction) largely based on this trade-off.
- Message queues (Kafka) win by writing huge sequential log segments.
- Backup tools and fsck benefit from sequential scans even on random-write workloads.
Concrete Example
Copying 1 GiB on a 7200 RPM HDD:
- Sequential dd bs=1M: ~7 seconds (150 MiB/s sustained).
- Sequential dd bs=4K: ~10 seconds (syscall overhead keeps you below full bandwidth).
- Random 4 KiB reads (fio): ~100 IOPS = 400 KiB/s. 1 GiB is 262,144 requests of 4 KiB, so ~2,600 seconds, or about 44 minutes.
Same 1 GiB. 400-fold spread.
On NVMe Gen4:
- Sequential: ~0.15 seconds.
- Random 4 KiB: ~800 kIOPS = ~3.1 GiB/s (3.3 GB/s). 1 GiB ~ 0.3 seconds.
Still a 2x spread, but the scale is different.
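The arithmetic behind these estimates is simple enough to check in a few lines. The sketch below uses the approximate device figures quoted earlier and ignores syscall overhead, caching, and queue-depth effects; the function names are illustrative.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

def seconds_sequential(total_bytes, mib_per_s):
    """Time to move total_bytes at a sustained sequential rate."""
    return total_bytes / (mib_per_s * MIB)

def seconds_random(total_bytes, iops, block_bytes=4 * KIB):
    """Time to move total_bytes as fixed-size random requests at a given IOPS."""
    requests = total_bytes / block_bytes
    return requests / iops

print(f"HDD sequential:  {seconds_sequential(GIB, 150):8.2f} s")
print(f"HDD random 4K:   {seconds_random(GIB, 100):8.2f} s")
print(f"NVMe sequential: {seconds_sequential(GIB, 7_000):8.2f} s")
print(f"NVMe random 4K:  {seconds_random(GIB, 800_000):8.2f} s")
```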
Common Confusion / Misconception
"SSDs erased the gap." They closed it. They did not erase it. Random write on SSD is worse than random read because of GC and erase-block alignment. Cold random reads on NVMe are still perhaps 5x slower than sequential reads of the same total volume.
"My workload is sequential because I only append." Check with blktrace or iostat -x 1. Even append-only logic can become random if the cache is too small and the FS fragments, or if you interleave with other writers. Application sequential does not always survive layering.
"Small block = random, large block = sequential." Not quite. Block size and access pattern are independent axes. 4 KiB can be sequential (reading adjacent blocks 1, 2, 3, ...) or random (reading blocks 1000, 20, 50000, ...). Sequential 4 KiB often still underperforms sequential 128 KiB because you are hitting syscall and request overhead.
How To Use It
When sizing or redesigning a storage layer:
- Measure the actual I/O pattern with iostat -x (avg request size, await, %util) or blktrace/iosnoop.
- Check whether sequential-looking workloads stay sequential through layers. A journaled FS over a RAID5 array over NVMe is three opportunities to fragment.
- Prefer sequential where possible: log-structured updates, large batched writes, separate log files from random data files.
- On HDD, fight tooth and nail for sequentiality. On NVMe, sequentiality still matters but less; parallelism and queue depth become first-class.
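One rough way to check whether a workload stays sequential is to compute, from a request trace, the fraction of requests that begin exactly where the previous one ended. The sketch below assumes the trace has already been reduced to (byte offset, length) pairs. That format and the function name are hypothetical; it is not the output format of blktrace or iosnoop.

```python
import random

def sequential_fraction(requests):
    """Fraction of requests that start exactly where the previous one ended."""
    if len(requests) < 2:
        return 0.0
    hits = 0
    for (prev_off, prev_len), (off, _) in zip(requests, requests[1:]):
        if off == prev_off + prev_len:
            hits += 1
    return hits / (len(requests) - 1)

BLOCK = 4096
rng = random.Random(0)  # fixed seed so the synthetic trace is repeatable

# Synthetic traces: block size and access pattern vary independently.
sequential_4k = [(i * BLOCK, BLOCK) for i in range(1000)]
sequential_128k = [(i * 128 * 1024, 128 * 1024) for i in range(1000)]
random_4k = [(rng.randrange(1_000_000) * BLOCK, BLOCK) for _ in range(1000)]

for name, trace in [("4 KiB sequential", sequential_4k),
                    ("128 KiB sequential", sequential_128k),
                    ("4 KiB random", random_4k)]:
    print(f"{name:20s} {sequential_fraction(trace):.3f}")
```

Running the same measure on traces captured at different layers (application, file system, block device) is one way to see where sequentiality is lost.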
Check Yourself
- Why do database designers care about this gap even on all-flash systems?
- Why does a write-back cache narrow the gap for writes but not for reads?
- Why is a log-structured file system particularly friendly to SSDs (not just HDDs)?
Mini Drill or Application
Use fio to run four workloads on your system and record results:
- 4 KiB sequential read, queue depth 1.
- 128 KiB sequential read, queue depth 8.
- 4 KiB random read, queue depth 1.
- 4 KiB random read, queue depth 32.
For each, record IOPS, throughput, and average latency. Explain the four results in one paragraph each.
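As a starting point, the four workloads can be expressed as fio command lines. The sketch below only builds and prints the commands; it does not run them. The test file path, the 1 GiB size, and the 30 second runtime are assumptions to adjust, and the libaio engine assumes Linux (queue depths above 1 need an asynchronous engine).

```python
import shlex

# The four workloads from the drill: (name, fio rw mode, block size, queue depth).
WORKLOADS = [
    ("seq-4k-qd1", "read", "4k", 1),
    ("seq-128k-qd8", "read", "128k", 8),
    ("rand-4k-qd1", "randread", "4k", 1),
    ("rand-4k-qd32", "randread", "4k", 32),
]

def fio_command(name, rw, bs, iodepth, filename="./fio-testfile"):
    """Build one fio invocation; filename is a hypothetical scratch file."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        "--size=1g",
        f"--rw={rw}",
        f"--bs={bs}",
        f"--iodepth={iodepth}",
        "--ioengine=libaio",   # Linux async engine, so iodepth takes effect
        "--direct=1",          # bypass the page cache
        "--runtime=30",
        "--time_based",
        "--output-format=json",
    ]

for workload in WORKLOADS:
    print(shlex.join(fio_command(*workload)))
```

Using direct I/O keeps the page cache from hiding the device's behavior, which matters most for the repeated sequential reads.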