External Exercises

This module has no LeetCode-style problems; the exercise is your own capstone. These lanes point to external reading-and-doing sets that build specific fluency when the concept pages are not enough. Work each lane against your own system, not a toy.

How To Use This Page

  1. Finish the relevant concept page and the matching practice page first.
  2. Pick a lane whose output you are still uncomfortable producing from scratch.
  3. Do the lane with your capstone repo open. The deliverable is a commit to your capstone, not notes.
  4. Maintain a mistake log with tags such as wrong SLI granularity, alert noise, unstructured log, missing trace hop, STRIDE gap missed, over-permissive role, untested backup, runbook missing rollback.

Lane 1: SLOs, Error Budgets, and Alerts

Use this lane when your SLO is aspirational or your alerts are noisy.

Target outcomes:

  • one committed library/raw/slo.md
  • one committed library/raw/error-budget-policy.md
  • at least one multi-window burn-rate alert live in your monitoring tool
  • a list of at least three non-SLO page-level alerts you have demoted to ticket or deleted
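
Before writing the multi-window burn-rate alert in your monitoring tool, it can help to work through the arithmetic by hand. The sketch below is one way to do that in Python. It assumes a request-based SLI, a 30-day SLO window, and the commonly cited threshold of 14.4 (roughly 2% of a 30-day budget consumed in one hour), checked over a 1-hour long window and a 5-minute short window. The function names and sample counts are illustrative and do not come from this module.

```python
from __future__ import annotations


def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A burn rate of 1.0 spends the budget exactly over the SLO window;
    higher values spend it proportionally faster.
    """
    if total == 0:
        return 0.0  # no traffic in the window, so nothing was burned
    allowed_error_rate = 1.0 - slo_target
    return (errors / total) / allowed_error_rate


def should_page(
    long_window: tuple[int, int],
    short_window: tuple[int, int],
    slo_target: float,
    threshold: float = 14.4,  # assumption: 2% of a 30-day budget in 1 hour
) -> bool:
    """Page only when BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window
    shows it is still happening, so the alert resets soon after recovery.
    Each window is an (errors, total) pair.
    """
    long_burn = burn_rate(*long_window, slo_target)
    short_burn = burn_rate(*short_window, slo_target)
    return long_burn >= threshold and short_burn >= threshold


# Hypothetical counts for a 99.9% SLO: (errors, total) per window.
print(should_page((200, 10_000), (30, 1_000), slo_target=0.999))  # sustained burn
print(should_page((200, 10_000), (1, 1_000), slo_target=0.999))   # already recovered
```

If your tool expresses alerts as queries rather than code, the same two conditions usually become one rule joined with a logical AND.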

Lane 2: Observability

Use this lane when you cannot reach the suspect span from an alert in under two minutes.

Target outcomes:

  • library/raw/logging.md with a named field schema
  • one commit that replaces at least five string logs with structured events
  • a capstone-live dashboard with three labeled rows answering the three questions
  • one real distributed trace stored and linkable by URL
  • library/raw/tracing.md with sampling policy and runbook-linking convention
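
For the commit that replaces string logs with structured events, the following is a minimal sketch using only the Python standard library. The field names (`user_id`, `duration_ms`, `trace_id`), the `fields` key passed through `extra`, and the `build_logger` helper are assumptions made for illustration; your own schema in library/raw/logging.md takes precedence.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Anything passed as extra={"fields": {...}} is attached to the record.
        event.update(getattr(record, "fields", {}))
        return json.dumps(event, sort_keys=True)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Hypothetical helper: a logger that emits JSON to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = build_logger("capstone")

# Before: log.info(f"payment failed for user {user_id} after {ms}ms")
# After: a stable event name plus queryable fields.
log.info(
    "payment_failed",
    extra={"fields": {"user_id": "u-123", "duration_ms": 412, "trace_id": "abc123"}},
)
```

Carrying a trace identifier on every event is one possible way to get from a log line to the matching trace; whether that suits your stack depends on the tracing convention you document.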

Lane 3: Threat Model, Secrets, Supply Chain, Least Privilege

Use this lane when your security posture is "probably fine."

Target outcomes:

  • one committed STRIDE worksheet with a full walk on one gap
  • one committed library/raw/security-policy.md
  • one CI step that fails on HIGH/CRITICAL dependency CVEs
  • at least one artifact that carries signed build provenance
  • one IAM role diff committed, with the breakage-and-widening log
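
The CI step that fails on HIGH/CRITICAL findings can be as small as a gate script over your scanner's report. The sketch below assumes a hypothetical report shape, a JSON list of objects with `id`, `package`, and `severity` keys. Real scanners each have their own output format, and many offer a built-in severity threshold, so treat this as an outline of the logic rather than a drop-in step.

```python
import sys

# Severities that should fail the build.
BLOCKING = {"HIGH", "CRITICAL"}


def blocking_findings(findings: list) -> list:
    """Return the findings whose severity should fail the build."""
    return [
        f for f in findings
        if str(f.get("severity", "")).upper() in BLOCKING
    ]


def gate_exit_code(findings: list) -> int:
    """Return 1 if any blocking finding exists, else 0.

    A CI runner treats a non-zero exit code as a failed step.
    """
    blocked = blocking_findings(findings)
    for f in blocked:
        print(
            f"{f.get('severity')}: {f.get('id')} in {f.get('package')}",
            file=sys.stderr,
        )
    return 1 if blocked else 0


# Hypothetical report contents; the IDs are placeholders, not real CVEs.
report = [
    {"id": "EXAMPLE-0001", "package": "libfoo", "severity": "LOW"},
    {"id": "EXAMPLE-0002", "package": "libbar", "severity": "HIGH"},
]
print(gate_exit_code(report))
```

In a pipeline you would load the scanner's JSON output from disk and pass the result of `gate_exit_code` to `sys.exit`, so the step fails whenever a blocking finding is present.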

Lane 4: Failure Planning, Backup, Runbooks

Use this lane when "what happens when X fails?" returns vague answers.

Target outcomes:

  • library/raw/top-failures.md with three prioritized failures
  • library/raw/reliability-decisions.md per external dependency
  • library/raw/recovery.md with a dated restore-drill log
  • three runbooks in library/raw/runbooks/* using the five-section template
  • library/raw/on-call.md with coverage, page-vs-ticket rules, and a kill switch

Self-Curated Problem Set

Build a custom set around real incidents in your capstone's staging history:

  • 3 staging incidents in the last 60 days -- what was the first-seen symptom and what was the actual cause?
  • 3 near-miss deploys -- what caught them, and what alert would have caught them automatically?
  • 3 cloud bills that surprised you -- which came from observability, backup, or logging, and is the trade-off still worth it?

These become postmortems, test cases, or PRR yellows, whichever fits best.

Completion Checklist

  • Completed at least one lane in full with artifacts committed
  • Logged at least 10 real mistakes and corrections in the mistake log
  • Walked the 18-item PRR and signed or red-listed each item honestly
  • At least one peer has validated the top runbook and the SLO document