
Module 4: Operational Readiness & Security Review: Mistake Clinic

This clinic turns wrong moves into reusable judgment. Use it after each practice page and again before the quiz or checkpoint.


Module-Specific Mistake Radar

Start with these traps. Replace or extend them with real mistakes from your own work.

| Mistake to look for | Where it shows up | Symptom | Repair evidence |
| --- | --- | --- | --- |
| Finishing SLO and Alert Lab with only a final answer | SLO and Alert Lab | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Finishing Observability Instrumentation Workshop with only a final answer | Observability Instrumentation Workshop | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Finishing Threat Model and Security Clinic with only a final answer | Threat Model and Security Clinic | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Finishing Operational Katas with only a final answer | Operational Katas | The work has no failed case, trace, test, proof gap, or design stress point. | Add the smallest broken example and show the repair that changes the result. |
| Treating Writing One Real SLI and SLO for Your Capstone as vocabulary instead of a tool | Writing One Real SLI and SLO for Your Capstone | The explanation names the concept but cannot decide between two cases. | Write one example, one non-example, and the rule that separates them. |
| Treating Error Budget for a Capstone: Small but Real as vocabulary instead of a tool | Error Budget for a Capstone: Small but Real | The explanation names the concept but cannot decide between two cases. | Write one example, one non-example, and the rule that separates them. |

Practice Mistake Checks

Pull any miss from these checks into your mistake log.

SLO and Alert Lab

Source: practice/01-slo-and-alert-lab.md

For each statement, identify the error:

  1. "Our SLO is 100% -- anything less and users complain."
  2. "We alert if error rate exceeds 1%; that's our SLO alert."
  3. "Budget is fine; we're at 78% consumed with 3 days left in the window."
  4. "We used the AmazonFreeTierMetrics default for our SLO target."
  5. "The alert fires on CPU > 90% because that's when things get slow."
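When checking a budget claim like statements 1 and 3, it can help to put numbers behind it. The following is a minimal sketch, assuming a time-based availability SLO. The 99.9% target and the 30-day window are illustrative assumptions, not values from the lab, and `budget_status` is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    allowed_bad_minutes: float    # total error budget for the window, in minutes
    remaining_bad_minutes: float  # budget not yet consumed, in minutes
    window_elapsed: float         # fraction of the window already elapsed (0..1)
    burn_rate: float              # consumed fraction / elapsed fraction; 1.0 is "on pace"


def budget_status(slo_target: float, window_days: float,
                  days_left: float, consumed: float) -> BudgetStatus:
    """Summarize an error budget for a time-based availability SLO."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    if not 0.0 <= days_left < window_days:
        raise ValueError("days_left must be in [0, window_days)")

    window_minutes = window_days * 24 * 60
    allowed = (1.0 - slo_target) * window_minutes
    elapsed = (window_days - days_left) / window_days
    return BudgetStatus(
        allowed_bad_minutes=allowed,
        remaining_bad_minutes=allowed * (1.0 - consumed),
        window_elapsed=elapsed,
        # Window-average pace only; it says nothing about how fast the
        # budget burned in the last hour or day.
        burn_rate=consumed / elapsed,
    )


if __name__ == "__main__":
    # Assumed values: 99.9% target, 30-day window, 3 days left, 78% consumed.
    print(budget_status(0.999, 30, 3, 0.78))
    # A 100% target leaves zero minutes of budget.
    print(budget_status(1.0, 30, 3, 0.0))
```

Under these assumptions the budget is about 43.2 minutes and the average burn rate is about 0.87. Whether that counts as "fine" still depends on the window length you actually use and on how recently the budget was spent, which this average does not show.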

Observability Instrumentation Workshop

Source: practice/02-o11y-instrumentation-workshop.md

For each statement, identify the error:

  1. "We log everything, just in case."
  2. "Our dashboard has 34 panels; all of them matter."
  3. "We sample 100% of traces; disk is cheap."
  4. "Span name is db.query(SELECT * FROM users WHERE id=42) so we know exactly which query."
  5. "We don't need trace_id in logs; the timestamps are enough to correlate."
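To test statement 5 against a real artifact, one option is to emit the trace ID on every log line and see how much easier correlation becomes. The sketch below uses only the Python standard library. The `current_trace_id` context variable, the logger name, and the JSON field names are assumptions for illustration; a real tracing library would supply the active trace ID.

```python
import contextvars
import io
import json
import logging

# Hypothetical holder for the active trace ID; "-" means "no active trace".
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only annotate it


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", "-"),
        })


def build_logger(stream) -> logging.Logger:
    """Return a logger that writes JSON lines with trace_id to `stream`."""
    logger = logging.getLogger("capstone.demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger(buffer)
    token = current_trace_id.set("abc123")  # made-up trace ID
    log.info("order placed")
    current_trace_id.reset(token)
    log.info("idle")
    print(buffer.getvalue(), end="")
```

With the ID on each line, a log search by `trace_id` returns exactly the lines for one request, which timestamps alone cannot guarantee when requests overlap.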

Threat Model and Security Clinic

Source: practice/03-threat-model-and-security-clinic.md

For each statement, identify the error:

  1. "We ran STRIDE but couldn't find gaps, so we're fine."
  2. "Our .env has all the secrets; Git won't commit it by accident."
  3. "AmazonS3FullAccess is easier; we'll tighten it later."
  4. "We enabled RDS backups, so we're backup-ready."
  5. "CI uses Action: *, Resource: * because it deploys a lot of services."
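Statements 3 and 5 can be turned into a failing check rather than an opinion. The sketch below scans an IAM-style policy document (the standard `Statement` / `Effect` / `Action` / `Resource` JSON shape) for broad wildcards. The function name `find_wildcard_grants` and the bucket ARN in the example are hypothetical, and the rule is a rough heuristic: some actions legitimately require `Resource: *`, so treat a finding as a prompt for review, not a verdict.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """Action and Resource may each be a single string or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_grants(policy: dict) -> list[str]:
    """Return one finding per broad wildcard in an Allow statement."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action")):
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: broad action {action!r}")
        for resource in _as_list(stmt.get("Resource")):
            if resource == "*":
                findings.append(f"statement {index}: resource '*'")
    return findings


if __name__ == "__main__":
    broad = {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
    }
    scoped = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            # Hypothetical bucket name, for illustration only.
            "Resource": "arn:aws:s3:::example-capstone-artifacts/*",
        }],
    }
    print(find_wildcard_grants(broad))
    print(find_wildcard_grants(scoped))
```

Running a check like this in CI gives the "failing then passing" evidence the clinic asks for: the broad policy produces findings, and the scoped one does not.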

Repair Protocol

For each real mistake:

  1. Reproduce the failure on the smallest example, trace, proof, query, command, or design sketch.
  2. Name the hidden assumption.
  3. Repair the artifact.
  4. Save evidence of what changed: a failing then passing test, a corrected proof step, a revised diagram, a safer command, a benchmark, or a review note.
  5. Add one retrieval card beginning with "Check... before..." or "Do not use... when...".

Mistake Log

| Date | Mistake | Symptom | Root cause | Repair evidence | Retrieval card |
| --- | --- | --- | --- | --- | --- |
| Starter | Pick one radar row above | Explain how it would fail in this module | Name the assumption | Add a counterexample or corrected artifact | Write the card before closing the page |

Completion Standard

  • At least five real mistakes are logged.
  • At least two mistakes include a counterexample or failing test.
  • At least one mistake connects to an older semester skill.
  • At least one correction changes code, a proof, a diagram, a command transcript, a query, or a design decision.
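If you keep the mistake log as data rather than as a table, the completion standard can be checked mechanically. This is a minimal sketch under that assumption: the `MistakeEntry` fields mirror the Mistake Log columns, while the three boolean flags and the function name `unmet_requirements` are hypothetical additions you would fill in by hand for each entry.

```python
from dataclasses import dataclass


@dataclass
class MistakeEntry:
    # Fields mirroring the Mistake Log columns.
    date: str
    mistake: str
    symptom: str
    root_cause: str
    repair_evidence: str
    retrieval_card: str
    # Hypothetical self-assessed flags, one per completion criterion.
    has_counterexample_or_failing_test: bool = False
    links_older_skill: bool = False
    changes_artifact: bool = False


def unmet_requirements(log: list[MistakeEntry]) -> list[str]:
    """Return the completion-standard items the log does not yet satisfy."""
    checks = [
        (len(log) >= 5,
         "log at least five real mistakes"),
        (sum(e.has_counterexample_or_failing_test for e in log) >= 2,
         "include a counterexample or failing test in at least two mistakes"),
        (any(e.links_older_skill for e in log),
         "connect at least one mistake to an older semester skill"),
        (any(e.changes_artifact for e in log),
         "change code, a proof, a diagram, a command transcript, "
         "a query, or a design decision in at least one correction"),
    ]
    return [message for ok, message in checks if not ok]


if __name__ == "__main__":
    # An empty log fails every item of the standard.
    for item in unmet_requirements([]):
        print("TODO:", item)
```

An empty list from `unmet_requirements` means the log meets the standard as encoded here; the flags are only as honest as the person setting them.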