Module 1: System Design Methodology: Mistake Clinic

This clinic turns wrong moves into reusable judgment. Use it after each practice page and again before the quiz or checkpoint.


Module-Specific Mistake Radar

Start with these traps. Replace or extend them with real mistakes from your own work.

| Mistake to look for | Where it shows up | Symptom | Repair evidence |
| --- | --- | --- | --- |
| Finishing Estimation and Framing Lab with only a final answer | Estimation and Framing Lab | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Finishing High-Level Design Workshop with only a final answer | High-Level Design Workshop | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Finishing Stress Test Clinic with only a final answer | Stress Test Clinic | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Finishing Design Interview Katas with only a final answer | Design Interview Katas | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Treating Understanding the Requirements as vocabulary instead of a tool | Understanding the Requirements | The explanation names the concept but cannot decide between two cases. | Write one example, one non-example, and the rule that separates them. |
| Treating Back-of-Envelope Estimation as vocabulary instead of a tool | Back-of-Envelope Estimation | The explanation names the concept but cannot decide between two cases. | Write one example, one non-example, and the rule that separates them. |

Practice Mistake Checks

Pull any miss from these checks into your mistake log.

Estimation and Framing Lab

Source: practice/01-estimation-and-framing-lab.md

For each statement, identify the error:

  1. "The system must support 1 billion users." (Missing what?)
  2. "P99 latency under 200 ms." (Acceptable as stated, or incomplete?)
  3. "We need a cache." (What makes this wrong as framing?)
  4. "Read:write is 10:1, so we need three read replicas." (What hidden assumption is this making?)
  5. "At 100× scale, we will shard." (Shard on what, why, and at what partition?)
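A statement such as "1 billion users" only becomes a design input once it is turned into a request rate. The sketch below shows one way to do that arithmetic. Every number in it is an illustrative assumption rather than a figure from the lab, and the names `LoadAssumptions` and `estimate_qps` are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class LoadAssumptions:
    total_users: int                          # the headline number in the statement
    daily_active_fraction: float              # assumed share of users active each day
    requests_per_active_user_per_day: float   # assumed per-user activity
    peak_to_average: float                    # assumed ratio of peak load to average load
    read_fraction: float                      # a 10:1 read:write ratio means 10/11 reads


def estimate_qps(a: LoadAssumptions) -> dict:
    """Turn a user count plus stated assumptions into average and peak request rates."""
    daily_requests = (
        a.total_users * a.daily_active_fraction * a.requests_per_active_user_per_day
    )
    avg_qps = daily_requests / SECONDS_PER_DAY
    peak_qps = avg_qps * a.peak_to_average
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "peak_read_qps": peak_qps * a.read_fraction,
        "peak_write_qps": peak_qps * (1 - a.read_fraction),
    }


# Illustrative inputs only: 10% daily actives, 10 requests each, 3x peak, 10:1 reads.
example = LoadAssumptions(
    total_users=1_000_000_000,
    daily_active_fraction=0.10,
    requests_per_active_user_per_day=10,
    peak_to_average=3.0,
    read_fraction=10 / 11,
)

for name, value in estimate_qps(example).items():
    print(f"{name}: {value:,.0f}")
```

Changing any single assumption changes the result, which is why a bare user count is incomplete as a requirement. The same applies to statement 4: a replica count follows from peak read QPS and per-replica capacity, not from the read:write ratio alone.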

High-Level Design Workshop

Source: practice/02-high-level-design-workshop.md

Identify the flaw:

  1. "We will use MongoDB because it's flexible." (What is missing from this sentence?)
  2. "Every service gets its own Redis cache." (When is this right; when is it wrong?)
  3. "Our database is the primary; we'll add a cache later." (What premature optimization is baked in, and which genuine debt is not?)
  4. "This endpoint has strict consistency, so we cache it." (What specifically is wrong?)
  5. "We put a CDN in front of everything." (What danger is lurking here?)

Stress Test Clinic

Source: practice/03-stress-test-clinic.md

Identify the flaw:

  1. "Auto-scaling handles 10×." (What is this statement avoiding?)
  2. "Replicas eliminate the SPOF." (Give two counter-examples.)
  3. "Our P99 is 50 ms, so we're fine." (What is missing?)
  4. "We'll add a circuit breaker later." (What class of failures does that decision accept right now?)
  5. "The cache is 99.999% available, so we can rely on it." (What does cache failure do to downstream origin load?)
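For statement 5, it can help to put numbers on what the origin sees when the cache disappears. The sketch below is a rough model under stated assumptions: the traffic level and hit ratio are made up for illustration, and `origin_qps` and `downtime_seconds_per_year` are hypothetical helper names.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def origin_qps(total_qps: float, hit_ratio: float) -> float:
    """Requests per second that miss the cache and reach the origin."""
    return total_qps * (1.0 - hit_ratio)


def downtime_seconds_per_year(availability: float) -> float:
    """Expected unavailable time per year for a given availability fraction."""
    return SECONDS_PER_YEAR * (1.0 - availability)


total_qps = 20_000   # assumed total request rate
hit_ratio = 0.95     # assumed steady-state cache hit ratio

normal = origin_qps(total_qps, hit_ratio)   # origin load with a healthy cache
cache_down = origin_qps(total_qps, 0.0)     # every request falls through to the origin
amplification = cache_down / normal         # equals 1 / (1 - hit_ratio)

print(f"origin load, cache healthy: {normal:,.0f} QPS")
print(f"origin load, cache down:    {cache_down:,.0f} QPS")
print(f"amplification:              {amplification:.1f}x")
print(f"downtime at 99.999%:        {downtime_seconds_per_year(0.99999):.0f} s/year")
```

Under these assumptions the origin is sized for roughly a twentieth of total traffic, so even a few minutes of cache unavailability per year presents it with about twenty times its normal load. The availability figure says how often that happens, not whether the origin survives it.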

Repair Protocol

For each real mistake:

  1. Reproduce the failure on the smallest example, trace, proof, query, command, or design sketch.
  2. Name the hidden assumption.
  3. Repair the artifact.
  4. Save evidence of what changed: a failing-then-passing test, corrected proof step, revised diagram, safer command, benchmark, or review note.
  5. Add one retrieval card beginning with "Check ... before ..." or "Do not use ... when ...".
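As an illustration of steps 1 to 4, here is a minimal sketch of a failing-then-passing check on a small sizing example. The scenario, the capacity numbers, and the function names are all hypothetical. The hidden assumption being repaired is that load divides evenly across replicas and that no replica ever fails.

```python
import math


def replicas_needed_broken(peak_read_qps: int, per_replica_qps: int) -> int:
    # Hidden assumption: rounding down is safe and no replica ever fails.
    return peak_read_qps // per_replica_qps


def replicas_needed(peak_read_qps: int, per_replica_qps: int, spare: int = 1) -> int:
    # Repair: round up, then keep `spare` extra replicas as failure headroom.
    return math.ceil(peak_read_qps / per_replica_qps) + spare


def can_serve(replicas: int, peak_read_qps: int, per_replica_qps: int, failed: int = 0) -> bool:
    """True if the surviving replicas have enough capacity for peak read load."""
    return (replicas - failed) * per_replica_qps >= peak_read_qps


PEAK, CAPACITY = 25_000, 10_000  # assumed peak read QPS and per-replica capacity

# Step 1: reproduce the failure on the smallest example.
broken = replicas_needed_broken(PEAK, CAPACITY)
print("broken sizing serves peak:", can_serve(broken, PEAK, CAPACITY))

# Steps 3 and 4: repair, then keep the check that now passes as evidence.
fixed = replicas_needed(PEAK, CAPACITY)
print("fixed sizing serves peak with one replica down:",
      can_serve(fixed, PEAK, CAPACITY, failed=1))
```

A retrieval card for this example might read: "Check replica count against peak load with one replica down before calling a sizing done."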

Mistake Log

| Date | Mistake | Symptom | Root cause | Repair evidence | Retrieval card |
| --- | --- | --- | --- | --- | --- |
| Starter | Pick one radar row above | Explain how it would fail in this module | Name the assumption | Add a counterexample or corrected artifact | Write the card before closing the page |

Completion Standard

  • At least five real mistakes are logged.
  • At least two mistakes include a counterexample or failing test.
  • At least one mistake connects to an older semester skill.
  • At least one correction changes code, a proof, a diagram, a command transcript, a query, or a design decision.
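If the log is kept as structured data, the four criteria above can be checked mechanically. The sketch below is one possible shape, not a required format: the `MistakeEntry` fields mirror the Mistake Log columns, while the three boolean flags and the `completion_report` function are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class MistakeEntry:
    date: str
    mistake: str
    symptom: str
    root_cause: str
    repair_evidence: str
    retrieval_card: str
    # Hypothetical flags, set by hand when logging the entry.
    has_counterexample_or_failing_test: bool = False
    connects_to_older_skill: bool = False
    changes_artifact: bool = False  # code, proof, diagram, transcript, query, or design decision


CARD_PREFIXES = ("Check", "Do not use")


def completion_report(log: list) -> dict:
    """Evaluate a mistake log against the completion standard, one boolean per criterion."""
    return {
        "five_real_mistakes": len(log) >= 5,
        "two_with_counterexample_or_test":
            sum(e.has_counterexample_or_failing_test for e in log) >= 2,
        "one_connects_to_older_skill": any(e.connects_to_older_skill for e in log),
        "one_changes_artifact": any(e.changes_artifact for e in log),
        # Extra check taken from the Repair Protocol's retrieval-card wording.
        "cards_well_formed": all(e.retrieval_card.startswith(CARD_PREFIXES) for e in log),
    }
```

A log counts as complete under this sketch when `all(completion_report(log).values())` is true. The flags still depend on honest self-assessment, so the report is a reminder rather than proof.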