Crash Consistency Clinic

Retrieval Prompts

  1. State from memory the three on-disk writes required to append a block under ext-style update-in-place.
  2. State the invariant a journaling FS relies on: "commit block last."
  3. Explain why ordered mode is preferred over writeback mode in ext4.
  4. State the COW invariant in one sentence: "old or new, never partial."
  5. Why is a directory fsync required after creating a new file?

Compare and Distinguish

Separate these pairs:

  • journaling vs copy-on-write
  • data=journal vs data=ordered vs data=writeback
  • fsck pass vs journal replay
  • fsync(file) vs fsync(dir) vs sync()
  • atomic rename vs direct overwrite

Common Mistake Check

Identify the error in each statement:

  1. "Journaling doubles all writes."
  2. "A COW FS cannot be corrupted."
  3. "fsync guarantees the drive has the data."
  4. "close implies fsync."
  5. "If the kernel issues writes in order, the disk commits them in order."
  6. "sync_file_range is a faster fsync."
  7. "Metadata journaling protects user data."

Crash Scenarios

For each scenario, draw the initial on-disk state, list the writes in order, and, for every crash point (that is, between each pair of consecutive writes), describe the post-crash state under:

  • naive update-in-place (no journal)
  • ext4 data=ordered journaling
  • COW (Btrfs-style)

Scenarios:

  1. Append: extend a 4 KiB file to 8 KiB.
  2. Overwrite: replace bytes 0-99 of a 1 MiB file.
  3. Rename: rename("a.tmp", "a") within a directory.
  4. Unlink: rm a, where a has nlink = 1 and no process has it open.
  5. Unlink-open: rm a, where a has nlink = 1 and one process still has it open.
  6. mkdir: create a new subdirectory in a non-full parent directory.

For each scenario you should be able to name the worst crash point and what is lost or corrupted.
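
As a starting point for scenario 1 under naive update-in-place, the crash points can be enumerated mechanically. The sketch below assumes the append needs three writes (data bitmap, inode, data block) and that the drive may persist any subset of them before the crash. The names `WRITES`, `classify`, and `crash_states` are hypothetical, and the verdict strings are a simplification, not a full fsck model.

```python
from itertools import combinations

# The three on-disk writes of an append (assumed model, no journal).
WRITES = ("bitmap", "inode", "data")


def classify(on_disk):
    """Describe the post-crash state given the writes that reached disk."""
    s = frozenset(on_disk)
    bitmap, inode, data = "bitmap" in s, "inode" in s, "data" in s
    if not bitmap and not inode:
        # Nothing, or only an unreachable data block, was written.
        return "consistent; append lost"
    if bitmap and inode:
        # Metadata agrees with itself; the question is what the block holds.
        return "complete" if data else "metadata consistent; file exposes garbage"
    if inode:
        # Inode updated without the bitmap.
        return "inconsistent: inode points to a block the bitmap calls free"
    # Bitmap updated without the inode.
    return "inconsistent: block marked allocated but unreferenced (space leak)"


def crash_states():
    """Map every subset of WRITES to its post-crash verdict."""
    return {
        subset: classify(subset)
        for k in range(len(WRITES) + 1)
        for subset in combinations(WRITES, k)
    }


if __name__ == "__main__":
    for subset, verdict in crash_states().items():
        print(f"{'+'.join(subset) or '(none)':<20} {verdict}")
```

The same table can be redrawn for data=ordered and COW by restricting which subsets are reachable; that restriction is the point of the exercise and is left out here.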

Safe Write Patterns

Implement and verify each pattern. For each, identify the exact durability guarantee:

  1. Durable atomic file replacement
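    • write(tmpfd, data); fsync(tmpfd); close(tmpfd); rename(tmp, target); fsync(dirfd), with tmp on the same filesystem as target (normally the same directory) and dirfd open on the directory that contains target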
  2. Safe append of a record to a log
    • write(fd, record); fsync(fd)
  3. Group commit
    • write(fd, records); fsync(fd) once per batch, N records per batch.
  4. Bad pattern to avoid
    • rename(new, target) without fsync on either the file or the directory.
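
Pattern 1 can be sketched as follows. This is a minimal sketch assuming a POSIX system (Linux in particular), where rename over an existing file is atomic and a directory can be opened read-only and passed to fsync. The function name `durable_replace` and the `.tmp` suffix are hypothetical choices.

```python
import os


def durable_replace(target: str, data: bytes) -> None:
    """Replace target so that a crash leaves either the old or the new contents."""
    dirpath = os.path.dirname(os.path.abspath(target))
    tmp = target + ".tmp"  # hypothetical name; same directory as target

    # Step 1: write the new contents to the temporary file.
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):  # os.write may write fewer bytes than asked
            written += os.write(fd, data[written:])
        # Step 2: force the file's data and metadata to stable storage.
        os.fsync(fd)
    finally:
        # Step 3: close the temporary file.
        os.close(fd)

    # Step 4: atomically switch the name to the new file.
    os.rename(tmp, target)

    # Step 5: fsync the directory so the rename itself is durable.
    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

Whether fsync actually reaches the platters still depends on the drive honoring cache flushes, which this sketch cannot check.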

For each pattern, explain what state the system recovers to if the crash happens after step k.
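
Patterns 2 and 3 differ only in how many records share one fsync. The sketch below assumes the log was opened with O_APPEND on a POSIX system. `append_record`, `append_batch`, and `_write_all` are hypothetical helper names. It does not handle torn records at the tail of the log, which a real log format would need to detect (for example with a length and checksum per record).

```python
import os
from typing import Iterable


def _write_all(fd: int, buf: bytes) -> None:
    """Loop until every byte is handed to the kernel (os.write may be partial)."""
    written = 0
    while written < len(buf):
        written += os.write(fd, buf[written:])


def append_record(fd: int, record: bytes) -> None:
    """Pattern 2: one record, one fsync. Durable once this returns."""
    _write_all(fd, record)
    os.fsync(fd)


def append_batch(fd: int, records: Iterable[bytes]) -> None:
    """Pattern 3: group commit. N records share a single fsync."""
    _write_all(fd, b"".join(records))
    os.fsync(fd)
```

A possible usage: open the log with `os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)`, then call `append_batch(fd, [b"r1\n", b"r2\n"])`. No record in the batch should be acknowledged to a client before `append_batch` returns, and a newly created log file also needs an fsync on its parent directory.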

Mini Application: Recovery Log

Build a table:

| Scenario | FS | Post-crash state | Recoverable? | Notes |
| Append: crash between data write and journal commit | ext4 ordered | Data block written but not pointed to | Yes (replay ignores the uncommitted transaction) | Safe; the append is lost |
| Append: crash after metadata commit, before data write | ext4 writeback | Committed metadata may point at a block holding garbage | Only partially | Stale data visible |
| Overwrite: crash mid-sector | any | Sector may be written atomically or torn (drive-level) | Depends on hardware | Drive sector atomicity |

Fill in at least 10 rows across the scenarios from the previous section.

Evidence Check

This page is complete only if you can:

  • trace any multi-block operation and identify every unsafe crash ordering
  • explain the role of commit blocks, barriers, and drive cache flushes
  • write safe code for durable rename and durable appends without reference