Module 1: System Design Methodology: Mistake Clinic

This clinic turns wrong moves into reusable judgment. Use it after each practice page and again before the quiz or checkpoint.

Module-Specific Mistake Radar

Start with these traps. Replace or extend them with real mistakes from your own work.

Mistake to look for	Where it shows up	Symptom	Repair evidence
Finishing Estimation and Framing Lab with only a final answer	Estimation and Framing Lab	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing High-Level Design Workshop with only a final answer	High-Level Design Workshop	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Stress Test Clinic with only a final answer	Stress Test Clinic	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Design Interview Katas with only a final answer	Design Interview Katas	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Treating Understanding the Requirements as vocabulary instead of a tool	Understanding the Requirements	The explanation names the concept but cannot decide between two cases.	Write one example, one non-example, and the rule that separates them.
Treating Back-of-Envelope Estimation as vocabulary instead of a tool	Back-of-Envelope Estimation	The explanation names the concept but cannot decide between two cases.	Write one example, one non-example, and the rule that separates them.

Pull any miss from these checks into your mistake log.

For each statement, identify the error:

"The system must support 1 billion users." (Missing what?)
"P99 latency under 200 ms." (Acceptable as stated, or incomplete?)
"We need a cache." (What makes this wrong as framing?)
"Read:write is 10:1, so we need three read replicas." (What hidden assumption is this making?)
"At 100× scale, we will shard." (Shard on what key, why, and at what threshold?)

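One way to probe the first and fourth statements is to make the missing numbers explicit. The sketch below is a minimal back-of-envelope calculation, not a sizing method taken from this module. Every input (daily active fraction, requests per user, peak factor, per-replica throughput, target utilization) is a hypothetical assumption chosen only for illustration, and the function names are invented here.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate_qps(total_users, daily_active_fraction, requests_per_user_per_day, peak_factor):
    """Return (average_qps, peak_qps) under the stated assumptions."""
    daily_requests = total_users * daily_active_fraction * requests_per_user_per_day
    average_qps = daily_requests / SECONDS_PER_DAY
    return average_qps, average_qps * peak_factor


def read_replicas_needed(total_qps, reads_per_write, per_replica_qps, target_utilization):
    """Replicas needed to serve the read share of total_qps (at least one)."""
    read_qps = total_qps * reads_per_write / (reads_per_write + 1)
    usable_qps = per_replica_qps * target_utilization
    return max(1, math.ceil(read_qps / usable_qps))


# Assumed inputs: 10% of users active per day, 20 requests each, peak = 3x average.
average_qps, peak_qps = estimate_qps(1_000_000_000, 0.10, 20, 3)

# Same 10:1 read:write ratio, two very different absolute loads.
# Assumed replica capacity: 5,000 reads/s, run at 70% utilization.
replicas_at_peak = read_replicas_needed(peak_qps, 10, 5_000, 0.7)
replicas_small_system = read_replicas_needed(1_100, 10, 5_000, 0.7)

print(f"average ~{average_qps:,.0f} QPS, peak ~{peak_qps:,.0f} QPS")
print(f"replicas at peak: {replicas_at_peak}, replicas for a 1,100 QPS system: {replicas_small_system}")
```

With these made-up inputs the same 10:1 ratio calls for 19 replicas in one case and 1 in the other, which suggests the ratio alone cannot justify "three". Change any assumption and the answer moves, so the assumptions belong in the requirement.
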
Identify the flaw:

"We will use MongoDB because it's flexible." (What is missing from this sentence?)
"Every service gets its own Redis cache." (When is this right; when is it wrong?)
"Our database is the primary; we'll add a cache later." (Which part of this defers a premature optimization, and which part takes on genuine debt?)
"This endpoint requires strict consistency, so we cache it." (What specifically is wrong?)
"We put a CDN in front of everything." (What danger is lurking here?)

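For the strict-consistency statement, a smallest broken example can be sketched as follows. This is a toy read-through cache with a fixed TTL and an explicit clock, assuming writes go straight to the database without invalidating the cache. The class, function, and key names are hypothetical and not tied to any real cache library.

```python
class TtlCache:
    """Toy in-memory cache; 'now' is passed in so the example is deterministic."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.entries = {}  # key -> (value, expires_at)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self.entries[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key, value, now):
        self.entries[key] = (value, now + self.ttl_seconds)


def read_through(cache, database, key, now):
    """Serve from cache when possible; otherwise read the database and fill the cache."""
    cached = cache.get(key, now)
    if cached is not None:
        return cached
    value = database[key]
    cache.put(key, value, now)
    return value


database = {"balance": 100}
cache = TtlCache(ttl_seconds=30)

first = read_through(cache, database, "balance", now=0)   # miss, fills the cache
database["balance"] = 40                                   # write bypasses the cache
stale = read_through(cache, database, "balance", now=10)  # still served from the cache
fresh = read_through(cache, database, "balance", now=30)  # entry expired, reads the database

print(first, stale, fresh)
```

The read at `now=10` returns the old balance even though the database already holds the new one. One possible repair is to invalidate or update the entry on every write; whether that is enough for "strict" depends on how the requirement defines it.
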
Identify the flaw:

"Auto-scaling handles 10×." (What is this statement avoiding?)
"Replicas eliminate the SPOF." (Give two counter-examples.)
"Our P99 is 50 ms, so we're fine." (What is missing?)
"We'll add a circuit breaker later." (What class of failures does that decision accept right now?)
"The cache is 99.999% available, so we can rely on it." (What does cache failure do to downstream origin load?)

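Several of the statements above can be stress-tested with a few lines of arithmetic. The sketch below is illustrative only: the traffic figure, hit ratio, and fan-out count are hypothetical assumptions, the fan-out formula assumes backend calls are independent, and the function names are invented here.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def origin_qps(total_qps, hit_ratio):
    """Load that reaches the origin when the cache absorbs hit_ratio of requests."""
    return total_qps * (1.0 - hit_ratio)


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes per year for a given availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def slow_request_fraction(backend_calls, percentile=0.99):
    """Chance that a request makes at least one call slower than the given
    percentile, assuming each call is independent of the others."""
    return 1.0 - percentile ** backend_calls


# Assumed workload: 10,000 QPS with a 95% cache hit ratio.
normal_origin = origin_qps(10_000, 0.95)
outage_origin = origin_qps(10_000, 0.0)  # cache down: every request reaches the origin

print(f"origin load: {normal_origin:.0f} QPS normally, {outage_origin:.0f} QPS with the cache down")
print(f"99.999% availability: ~{downtime_minutes_per_year(0.99999):.2f} minutes down per year")
print(f"fan-out of 100 calls: {slow_request_fraction(100):.1%} of requests hit a call slower than P99")
```

Under these assumptions the origin sees roughly 20 times its usual load during the few minutes per year the cache is unavailable, so the question becomes whether the origin was sized for that. The fan-out line shows why a healthy per-call P99 does not by itself describe what a user who triggers many calls experiences.
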
For each real mistake:

Reproduce the failure on the smallest example, trace, proof, query, command, or design sketch.
Name the hidden assumption.
Repair the artifact.
Save the evidence of what changed: a failing-then-passing test, corrected proof step, revised diagram, safer command, benchmark, or review note.
Add one retrieval card beginning with "Check ... before ..." or "Do not use ... when ...".

Date	Mistake	Symptom	Root cause	Repair evidence	Retrieval card
Starter	Pick one radar row above	Explain how it would fail in this module	Name the assumption	Add a counterexample or corrected artifact	Write the card before closing the page

At least five real mistakes are logged.
At least two mistakes include a counterexample or failing test.
At least one mistake connects to a skill from an earlier semester.
At least one correction changes code, a proof, a diagram, a command transcript, a query, or a design decision.
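
If the mistake log is kept in a script or notebook, the four completion criteria above can be checked mechanically. The sketch below is one hypothetical way to do that; the `MistakeEntry` fields and the `unmet_criteria` helper are invented for illustration and are not part of the course materials.

```python
from dataclasses import dataclass


@dataclass
class MistakeEntry:
    mistake: str
    repair_evidence: str
    has_counterexample_or_failing_test: bool = False
    links_earlier_semester_skill: bool = False
    changed_artifact: bool = False  # code, proof, diagram, transcript, query, or design decision


def unmet_criteria(entries):
    """Return the completion criteria that the log does not yet satisfy."""
    unmet = []
    if len(entries) < 5:
        unmet.append("at least five real mistakes logged")
    if sum(e.has_counterexample_or_failing_test for e in entries) < 2:
        unmet.append("at least two with a counterexample or failing test")
    if not any(e.links_earlier_semester_skill for e in entries):
        unmet.append("at least one linked to an earlier-semester skill")
    if not any(e.changed_artifact for e in entries):
        unmet.append("at least one correction that changes an artifact")
    return unmet


# Placeholder entries; real logs would describe actual mistakes and evidence.
log = [
    MistakeEntry("example 1", "evidence 1", has_counterexample_or_failing_test=True),
    MistakeEntry("example 2", "evidence 2", has_counterexample_or_failing_test=True),
    MistakeEntry("example 3", "evidence 3", links_earlier_semester_skill=True),
    MistakeEntry("example 4", "evidence 4", changed_artifact=True),
    MistakeEntry("example 5", "evidence 5"),
]

print(unmet_criteria(log))      # empty list: all four criteria met
print(unmet_criteria(log[:1]))  # lists what a one-entry log is still missing
```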