Module 5: Distributed Systems Fundamentals: Mistake Clinic

This clinic turns wrong moves into reusable judgment. Use it after each practice page and again before the quiz or checkpoint.

Module-Specific Mistake Radar

Start with these traps. Replace or extend them with real mistakes from your own work.

Mistake to look for	Where it shows up	Symptom	Repair evidence
Finishing Time and Ordering Lab with only a final answer	Time and Ordering Lab	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Failure Model Workshop with only a final answer	Failure Model Workshop	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Consensus Reasoning Clinic with only a final answer	Consensus Reasoning Clinic	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Distributed Systems Code Katas with only a final answer	Distributed Systems Code Katas	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Treating The Eight Fallacies of Distributed Computing as vocabulary instead of a tool	The Eight Fallacies of Distributed Computing	The explanation names the concept but cannot decide between two cases.	Write one example, one non-example, and the rule that separates them.
Treating Partial Failure: The Single Defining Property as vocabulary instead of a tool	Partial Failure: The Single Defining Property	The explanation names the concept but cannot decide between two cases.	Write one example, one non-example, and the rule that separates them.

Practice Mistake Checks

Pull any miss from these checks into your mistake log.

Time and Ordering Lab

Source: practice/01-time-and-ordering-lab.md

For each statement, identify the error:

"Our servers run NTP so timestamp-based last-write-wins is safe."
"I used System.nanoTime() to stamp events and compared them across two services."
"Lamport timestamps L(a) = 3 and L(b) = 3 mean a and b are concurrent."
"Vector clocks give a total order over events."
"We resolve write conflicts by the timestamp with the largest value."
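
One way to build a smallest broken example for the vector-clock statement is to compare two clocks directly. The sketch below is illustrative and not taken from the lab: the `compare` function and the process names `p1` and `p2` are hypothetical, and clocks are assumed to be plain dictionaries in which a missing entry counts as 0.

```python
from typing import Dict

VectorClock = Dict[str, int]  # hypothetical representation: process id -> counter


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify two vector clocks as 'equal', 'before', 'after', or 'concurrent'."""
    nodes = set(a) | set(b)
    # a <= b when every component of a is <= the matching component of b.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither clock dominates the other, so the events are unordered.
    return "concurrent"


print(compare({"p1": 2, "p2": 0}, {"p1": 2, "p2": 1}))  # before
print(compare({"p1": 2, "p2": 0}, {"p1": 1, "p2": 1}))  # concurrent
```

If any pair of events comes back as `concurrent`, the comparison cannot be a total order.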

Failure Model Workshop

Source: practice/02-failure-model-workshop.md

For each statement, identify the error:

"Since we run a modern cloud, we don't have partial failures anymore."
"TCP told us the connection is broken, so the peer process is down."
"We need Byzantine fault tolerance because our nodes sometimes return weird data."
"Our heartbeat is every 10ms with a 30ms timeout, so we detect failures fast."
"A failure detector that always says 'all alive' is at least safe."
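
The heartbeat statement can be stress-tested with a small trace. The sketch below is a simplified model, not a real detector: `false_suspicions` is a hypothetical helper, the arrival times are invented, and the 50 ms pause stands in for any delay (garbage collection, scheduling, a congested link) that leaves the peer alive but quiet.

```python
from typing import List, Tuple


def false_suspicions(arrivals_ms: List[int], timeout_ms: int) -> List[Tuple[int, int]]:
    """Return (suspected_at_ms, next_arrival_ms) for every gap longer than the timeout.

    Each entry marks a moment the detector would have declared the peer dead
    even though a later heartbeat shows it was still alive.
    """
    suspicions = []
    for prev, nxt in zip(arrivals_ms, arrivals_ms[1:]):
        if nxt - prev > timeout_ms:
            suspicions.append((prev + timeout_ms, nxt))
    return suspicions


# Heartbeats are sent every 10 ms; a hypothetical 50 ms pause delays the fourth one.
arrivals = [0, 10, 20, 70, 80]
print(false_suspicions(arrivals, timeout_ms=30))   # [(50, 70)]
print(false_suspicions(arrivals, timeout_ms=100))  # []
```

The shorter timeout detects real crashes sooner but also suspects a live peer. The longer timeout avoids that false suspicion at the cost of slower detection.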

Consensus Reasoning Clinic

Source: practice/03-consensus-reasoning-clinic.md

For each statement, identify the error:

"Raft guarantees exactly-once client semantics."
"A 4-node Raft cluster is fine - it can tolerate one failure."
"If a Paxos proposer gets no reply, it should propose a new value with a lower proposal number."
"Under Raft, if the leader crashes, any follower can become the new leader."
"FLP says consensus is unsolvable, so Paxos and Raft are unsound."
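
The cluster-size statement can be checked with quorum arithmetic. This is a minimal sketch, assuming a majority-quorum protocol with crash faults only. The function names are hypothetical.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Nodes that can crash while a majority is still reachable."""
    return n - majority(n)


for n in (3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from three nodes to four raises the quorum size without raising the number of failures the cluster survives.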

Repair Protocol

For each real mistake:

Reproduce the failure on the smallest example, trace, proof, query, command, or design sketch.
Name the hidden assumption.
Repair the artifact.
Save evidence of what changed: a failing-then-passing test, corrected proof step, revised diagram, safer command, benchmark, or review note.
Add one retrieval card beginning with "Check... before..." or "Do not use... when...".
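
The protocol above can be walked through on a last-write-wins mistake. The sketch below is one possible shape for the evidence, not a prescribed solution: the `Write` type, both resolver functions, and the 200 ms clock skew are hypothetical, and the repair assumes a per-key version number handed out by a single coordinator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    wall_clock_ms: int  # timestamp taken from the writer's own clock
    version: int        # assumption: per-key version from one coordinator


def resolve_by_wall_clock(a: Write, b: Write) -> Write:
    """The mistaken resolver: trusts timestamps from different machines."""
    return a if a.wall_clock_ms >= b.wall_clock_ms else b


def resolve_by_version(a: Write, b: Write) -> Write:
    """One possible repair: order writes by the coordinator's version."""
    return a if a.version >= b.version else b


# Step 1, reproduce: node A's clock runs 200 ms fast, and A writes before B.
first = Write("from-A", wall_clock_ms=1_200, version=1)
second = Write("from-B", wall_clock_ms=1_050, version=2)

# Step 2, hidden assumption: clocks on different nodes agree.
# The later write is silently discarded by the timestamp resolver.
assert resolve_by_wall_clock(first, second).value == "from-A"

# Steps 3 and 4, repair and evidence: the same inputs now keep the later write.
assert resolve_by_version(first, second).value == "from-B"
```

A matching retrieval card might read: "Do not use wall-clock timestamps to order writes when the writers run on different machines."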

Mistake Log

Date	Mistake	Symptom	Root cause	Repair evidence	Retrieval card
Starter	Pick one radar row above	Explain how it would fail in this module	Name the assumption	Add a counterexample or corrected artifact	Write the card before closing the page

Completion Standard

At least five real mistakes are logged.
At least two mistakes include a counterexample or failing test.
At least one mistake connects to an older semester skill.
At least one correction changes code, a proof, a diagram, a command transcript, a query, or a design decision.
