CI/CD Katas

Focused, repeatable drills. Complete each kata end-to-end. Then do it again from scratch until the shape is automatic.

Kata 1: GitHub Actions Workflow with OIDC Deploy to Staging

Time limit: 20 minutes (after the first attempt)
Goal: build fluency authoring a workflow that builds, tests, and deploys to staging using OIDC -- no static cloud keys.
Setup: a sample repo with a simple service (any language) and an AWS or GCP target account you can configure an OIDC trust policy on.

Implementation checklist:

One workflow file .github/workflows/deploy-staging.yml.
Triggers: push to main and workflow_dispatch.
permissions: block -- id-token: write for OIDC, contents: read.
Jobs:
- test -- lint and unit tests.
- build -- build and push a container image tagged by commit SHA to GHCR.
- deploy-staging -- needs: build; uses aws-actions/configure-aws-credentials with role-to-assume; deploys the image.
The AWS IAM role's trust policy restricts role assumption to tokens whose OIDC sub claim is repo:<owner>/<repo>:ref:refs/heads/main.
No AWS_ACCESS_KEY_ID anywhere in the workflow or repo secrets.
Smoke test after deploy -- curl the staging health endpoint, fail the job on non-200.
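The trust-policy and smoke-test items above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function names, the account ID, and the owner/repo values are hypothetical, and it assumes the GitHub OIDC provider (token.actions.githubusercontent.com) is already registered in the AWS account with the audience sts.amazonaws.com.

```python
import json
import urllib.request

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def build_trust_policy(account_id, owner, repo, branch="main"):
    """Build an IAM trust policy that only tokens from one repo branch satisfy.

    account_id, owner, and repo are placeholders supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience assumed to be the AWS STS default.
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # Pins the role to pushes on a single branch of a single repo.
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{owner}/{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


def smoke_test(url, timeout=10.0):
    """Return True only when the health endpoint answers HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers URLError, HTTPError (non-2xx), refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical values; substitute your own account and repository.
    print(json.dumps(build_trust_policy("123456789012", "acme", "shop"), indent=2))
```

In the workflow, a step could call smoke_test against the staging health URL and exit non-zero when it returns False, which fails the job.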

Repeat until: you can author this workflow from memory in under 20 minutes, it validates on first push, and the deploy succeeds with a freshly created OIDC trust policy.

Kata 2: Design a Canary Rollout for One Service

Time limit: 30 minutes
Goal: fluency with rollout design and rollback criteria.
Setup: pick a single service (real or one from workshop 02).

Produce a one-page plan containing:

Strategy: canary. State why (not rolling, not blue-green) for this service.
Traffic progression: exact weights and wait times (e.g., 5% -> 10m -> 25% -> 15m -> 50% -> 30m -> 100%).
Success metrics: concrete queries, not vibes. At least: success rate, p95 latency. If relevant: business metric (conversion, queue depth).
Rollback trigger: threshold + duration for each success metric.
Rollback action: exact command or tool invocation; tested.
Rollback owner: named role (e.g., on-call), no approval chain.
Rollback deadline: a stated end-to-end time from trigger to fully rolled back, measured in seconds to minutes.
Observability: the deploy marker, the canary dashboard link, the alert route.
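To make "threshold + duration" concrete, the plan items above can be modeled as data. The sketch below is illustrative only: the class names, the 0.99 success-rate threshold, and the three-sample duration are assumptions, and a real rollout would express the same rules in the Argo Rollouts or Flagger spec rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    weight_percent: int  # share of traffic sent to the canary
    hold_minutes: int    # observation time before the next step


@dataclass(frozen=True)
class RollbackTrigger:
    metric: str
    threshold: float
    breach_when: str         # "below" (e.g. success rate) or "above" (e.g. latency)
    sustained_samples: int   # consecutive breaching samples required to trigger

    def breached(self, value):
        if self.breach_when == "below":
            return value < self.threshold
        return value > self.threshold


def total_hold_minutes(steps):
    """Minimum wall-clock time before the canary reaches 100%."""
    return sum(step.hold_minutes for step in steps)


def should_roll_back(trigger, samples):
    """True once the metric breaches its threshold for enough consecutive samples."""
    streak = 0
    for value in samples:
        streak = streak + 1 if trigger.breached(value) else 0
        if streak >= trigger.sustained_samples:
            return True
    return False


# The example progression from the plan: 5% -> 10m -> 25% -> 15m -> 50% -> 30m -> 100%.
PROGRESSION = [Step(5, 10), Step(25, 15), Step(50, 30), Step(100, 0)]

# Hypothetical trigger: success rate below 99% for three consecutive samples.
SUCCESS_RATE = RollbackTrigger("success_rate", 0.99, "below", 3)

if __name__ == "__main__":
    print(total_hold_minutes(PROGRESSION))
    print(should_roll_back(SUCCESS_RATE, [0.995, 0.98, 0.985, 0.97]))
```

Requiring consecutive breaches rather than a single bad sample is one way to encode the duration half of the trigger; a single recovered sample resets the streak.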

Deliverable: a Markdown doc + a working Argo Rollouts or Flagger spec (YAML) that implements it.

Repeat until: you can author both the doc and the spec in under 30 minutes for a new service description, and a teammate can critique it with fewer than 3 findings.

Kata 3: Expand/Contract Migration Paired with Backward-Compatible Code

Time limit: 45 minutes
Goal: operational fluency with DB schema changes that do not require downtime.
Setup: pick any non-trivial column change. Recommended: split a full_name column into first_name + last_name.

Produce four separately shippable PRs:

PR 1 -- Expand. Migration: ADD COLUMN first_name, ADD COLUMN last_name. Code: dual-write on save, continue reading full_name.
PR 2 -- Backfill. Migration to populate first_name / last_name for existing rows. Code unchanged.
PR 3 -- Switch reads. Code change: read first_name / last_name, fall back to parsing full_name if null. No schema change.
PR 4 -- Contract. Stop writing full_name. Later PR: ALTER TABLE ... DROP COLUMN full_name.
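The application-side behavior of PRs 1 to 3 can be sketched with plain dictionaries standing in for rows. This is a sketch under stated assumptions: the function names are hypothetical, and the split rule (first token is first_name, the remainder is last_name) is a deliberately naive choice that real name data will challenge.

```python
def split_full_name(full_name):
    """Naive split: first token becomes first_name, the remainder last_name."""
    parts = full_name.strip().split(maxsplit=1)
    if not parts:
        return "", ""
    if len(parts) == 1:
        return parts[0], ""
    return parts[0], parts[1]


def save_user_pr1(row, full_name):
    """PR 1 (expand): dual-write the old column and both new columns."""
    first, last = split_full_name(full_name)
    row.update(full_name=full_name, first_name=first, last_name=last)
    return row


def backfill_pr2(rows):
    """PR 2 (backfill): fill new columns only where still null, so reruns are safe."""
    changed = 0
    for row in rows:
        if row.get("first_name") is None or row.get("last_name") is None:
            row["first_name"], row["last_name"] = split_full_name(row.get("full_name") or "")
            changed += 1
    return changed


def read_name_pr3(row):
    """PR 3 (switch reads): prefer new columns, fall back to parsing full_name."""
    if row.get("first_name") is not None and row.get("last_name") is not None:
        return row["first_name"], row["last_name"]
    return split_full_name(row.get("full_name") or "")


if __name__ == "__main__":
    legacy = {"full_name": "Ada Lovelace", "first_name": None, "last_name": None}
    print(read_name_pr3(legacy))    # fallback path, before the backfill
    print(backfill_pr2([legacy]))   # number of rows updated
    print(read_name_pr3(legacy))    # new-column path, after the backfill
```

Because the backfill skips rows that are already populated, it can be rerun or resumed after an interruption, which keeps the rollback path for PR 2 simple.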

For each PR, state:

the rollback path from that deployed state
the minimum time window between this deploy and the next
what metric you would watch to confirm it is safe to proceed

Repeat until: you can sketch the four PRs and the transitions from memory in under 45 minutes, and you can name at least one alternative change type (add nullable column, change column type, split row into two tables) and adapt the pattern.

Kata 4: Write a Release Note Generator Spec

Time limit: 30 minutes
Goal: formalize release-note generation rules so they can be automated.
Setup: a repo using Conventional Commits (or pick one to retrofit).

Produce a specification document covering:

Input. Range of commits (vA..vB). Optional: PR metadata from the forge API.
Grouping rules. How commit types map to changelog sections:
- feat: -> Added
- fix: -> Fixed
- refactor: / perf: -> optional Changed
- docs: / chore: -> excluded by default
- BREAKING CHANGE: or feat!: -> Changed plus a ### Breaking Changes subsection
Version bump logic. How the presence of feat: vs fix: vs BREAKING CHANGE: determines MAJOR/MINOR/PATCH.
Commit-to-entry transform. Rules for turning a commit subject into a bullet: strip type prefix, capitalize, include PR link if available, group by scope if present.
Output. Full Markdown block ready to drop into CHANGELOG.md; separate human-facing summary for release notes.
Edge cases. Reverts, merges of merges, squash-merge scopes, Dependabot PRs.
Example. Run the spec against a real past release manually; compare with what was actually published.
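The grouping, version-bump, and commit-to-entry rules above might be prototyped as follows. This is a minimal sketch with hypothetical function names, and it reflects one reading of the spec: breaking commits are listed under both Changed and Breaking Changes, refactor: and perf: are mapped to Changed, and scope grouping, PR links, and the listed edge cases are left out.

```python
import re

# Conventional Commits header: type, optional (scope), optional "!", then ": subject".
HEADER = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<bang>!)?: (?P<subject>.+)$"
)
SECTION_FOR_TYPE = {"feat": "Added", "fix": "Fixed", "refactor": "Changed", "perf": "Changed"}
SECTION_ORDER = ["Breaking Changes", "Added", "Changed", "Fixed"]


def parse_commit(message):
    """Parse one commit message; return None if it is not a Conventional Commit."""
    lines = message.splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return None
    breaking = match["bang"] is not None or any(
        line.startswith("BREAKING CHANGE:") for line in lines[1:]
    )
    subject = match["subject"]
    return {
        "type": match["type"],
        "scope": match["scope"],
        "breaking": breaking,
        # Strip the type prefix and capitalize the first letter.
        "entry": subject[0].upper() + subject[1:],
    }


def version_bump(commits):
    """Return 'major', 'minor', 'patch', or None for the parsed commit range."""
    if any(c["breaking"] for c in commits):
        return "major"
    if any(c["type"] == "feat" for c in commits):
        return "minor"
    if any(c["type"] == "fix" for c in commits):
        return "patch"
    return None


def render_changelog(commits):
    """Render Markdown sections; docs:, chore:, and unknown types are excluded."""
    sections = {name: [] for name in SECTION_ORDER}
    for c in commits:
        if c["breaking"]:
            sections["Breaking Changes"].append(c["entry"])
            sections["Changed"].append(c["entry"])
        elif c["type"] in SECTION_FOR_TYPE:
            sections[SECTION_FOR_TYPE[c["type"]]].append(c["entry"])
    blocks = []
    for name in SECTION_ORDER:
        if sections[name]:
            bullets = "\n".join(f"- {entry}" for entry in sections[name])
            blocks.append(f"### {name}\n\n{bullets}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    raw = ["feat(api): add pagination", "fix: handle empty cart", "docs: update readme"]
    parsed = [c for c in map(parse_commit, raw) if c is not None]
    print(version_bump(parsed))
    print(render_changelog(parsed))
```

Writing the rules as pure functions over a list of messages makes the "two teammates produce identical Markdown" check easy to automate: feed both implementations the same commit range and compare the output strings.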

Repeat until: two teammates, given the spec and the same commit range, produce identical changelog Markdown without consulting each other.

Completion Standard

Each kata completed end-to-end, in order, at least once.
Katas 1 and 2 repeated until completed within the time limit.
Kata 3 validated against a production-like dataset (realistic row counts on staging, not unit-test fixtures).
Kata 4 spec used to generate at least one real release note.
