Module 4: Operational Readiness & Security Review: Mistake Clinic

This clinic turns wrong moves into reusable judgment. Use it after each practice page and again before the quiz or checkpoint.

Module-Specific Mistake Radar

Start with these traps. Replace or extend them with real mistakes from your own work.

Mistake to look for	Where it shows up	Symptom	Repair evidence
Finishing SLO and Alert Lab with only a final answer	SLO and Alert Lab	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Observability Instrumentation Workshop with only a final answer	Observability Instrumentation Workshop	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Threat Model and Security Clinic with only a final answer	Threat Model and Security Clinic	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Finishing Operational Katas with only a final answer	Operational Katas	The work has no failed case, trace, test, proof gap, or design stress point.	Add the smallest broken example and show the repair that changes the result.
Treating "Writing One Real SLI and SLO for Your Capstone" as vocabulary instead of a tool	Writing One Real SLI and SLO for Your Capstone	The explanation names the concept but cannot decide between two cases.	Write one example, one non-example, and the rule that separates them.
Treating "Error Budget for a Capstone: Small but Real" as vocabulary instead of a tool	Error Budget for a Capstone: Small but Real	The explanation names the concept but cannot decide between two cases.	Write one example, one non-example, and the rule that separates them.

Practice Mistake Checks

Pull any miss from these checks into your mistake log.

SLO and Alert Lab

Source: practice/01-slo-and-alert-lab.md

For each statement, identify the error:

"Our SLO is 100% -- anything less and users complain."
"We alert if error rate exceeds 1%; that's our SLO alert."
"Budget is fine; we're at 78% consumed with 3 days left in the window."
"We used the AmazonFreeTierMetrics default for our SLO target."
"The alert fires on CPU > 90% because that's when things get slow."
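
When you check the SLO target and budget statements above, it can help to run the arithmetic. The sketch below is a minimal illustration, not a reference answer. The function names and the 30-day and 7-day windows are assumptions, since the statements do not say which window applies.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Events allowed to fail in the window: (1 - SLO target) * total events."""
    return (1.0 - slo_target) * total_events


def burn_rate(budget_consumed: float, window_elapsed: float) -> float:
    """Fraction of budget spent divided by fraction of window elapsed.

    A value above 1.0 means the budget runs out before the window ends
    if the current pace continues.
    """
    return budget_consumed / window_elapsed


# A 100% target leaves no budget at all: any single failure breaks the SLO.
no_budget = error_budget(1.0, 1_000_000)

# A 99.9% target over the same traffic allows roughly 1,000 failed events.
some_budget = error_budget(0.999, 1_000_000)

# "78% consumed with 3 days left" reads differently depending on the window.
pace_30_day = burn_rate(0.78, 27 / 30)  # hypothetical 30-day window
pace_7_day = burn_rate(0.78, 4 / 7)     # hypothetical 7-day window
```

Under these assumptions the 30-day pace comes out below 1.0 and the 7-day pace above it, so the same sentence can describe a healthy budget or one that will be exhausted early. A statement about budget needs the window length before it can be judged.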

Observability Instrumentation Workshop

Source: practice/02-o11y-instrumentation-workshop.md

For each statement, identify the error:

"We log everything, just in case."
"Our dashboard has 34 panels; all of them matter."
"We sample 100% of traces; disk is cheap."
"Span name is db.query(SELECT * FROM users WHERE id=42) so we know exactly which query."
"We don't need trace_id in logs; the timestamps are enough to correlate."
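
The last two statements concern span naming and log correlation. The sketch below shows one possible repair using only the Python standard library. It is a minimal illustration under stated assumptions: `current_trace_id`, `TraceIdFilter`, `build_logger`, and `span_for_query` are hypothetical names, and the span is a plain dictionary rather than a call into any tracing library.

```python
import contextvars
import logging

# Hypothetical holder for the active trace ID; a real tracer would supply this.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True


def build_logger(stream) -> logging.Logger:
    """Return a logger whose lines carry trace_id for correlation with traces."""
    logger = logging.getLogger("capstone.demo")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def span_for_query(operation: str, table: str, statement: str) -> dict:
    """Keep the span name low-cardinality; put the query text in an attribute.

    The statement should be parameterized (for example "... WHERE id = ?")
    so that user-specific values do not appear in the name or the attribute.
    """
    return {
        "name": f"db.query {operation} {table}",
        "attributes": {"db.statement": statement},
    }
```

With this shape, every lookup on the users table shares one span name, and a log line can be joined to its trace by ID rather than by guessing from timestamps.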

Threat Model and Security Clinic

Source: practice/03-threat-model-and-security-clinic.md

For each statement, identify the error:

"We ran STRIDE but couldn't find gaps, so we're fine."
"Our .env has all the secrets; Git won't commit it by accident."
"AmazonS3FullAccess is easier; we'll tighten it later."
"We enabled RDS backups, so we're backup-ready."
"CI uses Action: *, Resource: * because it deploys a lot of services."
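
For the two permission statements above, a small check can turn "we'll tighten it later" into evidence. The sketch below scans an IAM-style policy document, given as a Python dictionary, for wildcard grants. It is a minimal illustration: `wildcard_findings` is a hypothetical helper, the bucket name is invented, and the check covers only Allow statements with literal `*` or `service:*` patterns.

```python
def _as_list(value):
    """IAM policy fields may be a single string or a list of strings."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def wildcard_findings(policy: dict) -> list:
    """Return (statement index, field, value) for each broad Allow grant."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for action in _as_list(statement.get("Action")):
            # Flag "*" and whole-service grants such as "s3:*".
            if action == "*" or action.endswith(":*"):
                findings.append((index, "Action", action))
        for resource in _as_list(statement.get("Resource")):
            if resource == "*":
                findings.append((index, "Resource", resource))
    return findings


# The CI policy from the statement above, written out as a policy document.
ci_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}

# A narrower alternative; the bucket name is a made-up example.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-capstone-artifacts/*",
        }
    ],
}
```

Running `wildcard_findings(ci_policy)` reports both the action and the resource wildcard, while `wildcard_findings(scoped_policy)` returns an empty list. A before-and-after pair like this is one form of repair evidence for the mistake log.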

Repair Protocol

For each real mistake:

Reproduce the failure on the smallest example, trace, proof, query, command, or design sketch.
Name the hidden assumption.
Repair the artifact.
Save evidence of what changed: a failing-then-passing test, corrected proof step, revised diagram, safer command, benchmark, or review note.
Add one retrieval card beginning with "Check ... before ..." or "Do not use ... when ...".

Mistake Log

Date	Mistake	Symptom	Root cause	Repair evidence	Retrieval card
Starter	Pick one radar row above	Explain how it would fail in this module	Name the assumption	Add a counterexample or corrected artifact	Write the card before closing the page

Completion Standard

At least five real mistakes are logged.
At least two mistakes include a counterexample or failing test.
At least one mistake connects to an older semester skill.
At least one correction changes code, a proof, a diagram, a command transcript, a query, or a design decision.
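
If you keep the mistake log in a structured form, the four criteria above can be checked mechanically. The sketch below is one possible encoding. The `MistakeEntry` fields and the `completion_gaps` function are hypothetical names chosen for illustration, not part of the course materials.

```python
from dataclasses import dataclass


@dataclass
class MistakeEntry:
    """One row of the mistake log, reduced to what the standard counts."""

    mistake: str
    has_counterexample_or_failing_test: bool = False
    links_older_skill: bool = False
    changed_artifact: bool = False  # code, proof, diagram, transcript, query, design


def completion_gaps(entries: list) -> list:
    """Return the unmet criteria; an empty list means the standard is met."""
    gaps = []
    if len(entries) < 5:
        gaps.append("fewer than five mistakes logged")
    if sum(e.has_counterexample_or_failing_test for e in entries) < 2:
        gaps.append("fewer than two counterexamples or failing tests")
    if not any(e.links_older_skill for e in entries):
        gaps.append("no link to an older semester skill")
    if not any(e.changed_artifact for e in entries):
        gaps.append("no correction that changed an artifact")
    return gaps
```

Calling `completion_gaps` on your log before the checkpoint lists exactly which criteria still need work.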
