Event-Driven Katas
Focused, repeatable drills designed to build fluency in event design, integration plumbing, workflow thinking, and CQRS decision-making. Complete each kata multiple times until the pattern feels automatic.
Kata 1: Design an Event for OrderPlaced
Time limit: 15 minutes
Goal: Produce a production-ready OrderPlaced event, end to end, under time pressure.
Setup: You are the backend engineer for a marketplace checkout service. A customer just placed an order.
Produce, in 15 minutes:
- Event payload (JSON) with event_id, occurred_at, order_id, customer_id, line items, total in cents, currency, schema_version.
- Topic name following an explicit convention.
- Two consumer services that will subscribe; for each, state whether it prefers a thin notification event or event-carried state transfer (ECST).
- One field you deliberately leave OFF the event and why (PII, size, volatility, secret).
- One non-obvious property you add to the event (e.g., correlation ID, causation ID, saga ID).
- The outbox DDL line for this event (just enough to store the payload).
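One possible shape for the payload and topic name, as a minimal sketch rather than a reference answer. The field names line_items, total_cents, correlation_id, and event_type, the ID formats, and the topic convention are assumptions made for illustration; the kata only fixes the fields listed above.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed naming convention: <domain>.<entity>.<event>.v<major schema version>
TOPIC = "checkout.order.placed.v1"

def build_order_placed(order_id, customer_id, items, currency, correlation_id):
    """Assemble an OrderPlaced event as a plain dict ready for JSON serialization."""
    total_cents = sum(i["unit_price_cents"] * i["quantity"] for i in items)
    return {
        "event_id": str(uuid.uuid4()),          # unique per event, used for consumer dedup
        "event_type": "OrderPlaced",
        "schema_version": 1,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "correlation_id": correlation_id,       # the non-obvious property: ties the event to the request
        "order_id": order_id,
        "customer_id": customer_id,
        "line_items": items,
        "total_cents": total_cents,             # integer cents, never floats
        "currency": currency,
        # Deliberately left off: shipping address and email (PII).
        # Consumers that need them look them up by customer_id.
    }

event = build_order_placed(
    order_id="ord_1001",
    customer_id="cus_42",
    items=[
        {"sku": "SKU-1", "quantity": 2, "unit_price_cents": 1250},
        {"sku": "SKU-2", "quantity": 1, "unit_price_cents": 499},
    ],
    currency="USD",
    correlation_id="req_7f3a",
)
print(TOPIC)
print(json.dumps(event, indent=2))
```

Computing the total from the line items inside the builder keeps the two from drifting apart; whether that belongs in the event builder or the domain model is a design choice worth defending during the kata.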
Repeat until: You can produce all six artifacts in under 10 minutes without looking at Concept 01, 05, or 06. Try three different domains: an order, an insurance claim, a user sign-up.
Kata 2: Implement the Outbox Pattern Sketch
Time limit: 20 minutes
Goal: Sketch a complete outbox implementation, DDL to relay.
Setup: A service that does "write DB, then publish event." You have Postgres + Kafka.
Produce, in 20 minutes:
1. Schemas (DDL)
- primary entity table (orders)
- outbox table with the right columns and indexes
- justify the index choice
2. Write transaction
Write the actual SQL (or Python + SQL) for place_order(order) that atomically inserts the order row and the outbox row.
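A minimal sketch of steps 1 and 2 together. To stay runnable without a database server it uses Python's built-in sqlite3 as a stand-in for Postgres; the column names, the topic string, and the index name are assumptions, and the comments note where Postgres types would differ.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

DDL = """
CREATE TABLE orders (
    order_id     TEXT PRIMARY KEY,
    customer_id  TEXT NOT NULL,
    total_cents  INTEGER NOT NULL,
    currency     TEXT NOT NULL
);
CREATE TABLE outbox (
    event_id     TEXT PRIMARY KEY,
    topic        TEXT NOT NULL,
    payload      TEXT NOT NULL,   -- JSONB in Postgres
    created_at   TEXT NOT NULL,   -- TIMESTAMPTZ in Postgres
    published_at TEXT             -- NULL until the relay has published the row
);
-- Partial index: the relay only scans unpublished rows, so the index stays small.
CREATE INDEX outbox_unpublished ON outbox (created_at) WHERE published_at IS NULL;
"""

def place_order(conn, order):
    """Insert the order row and its outbox row in one transaction; return the event_id."""
    event_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()
    payload = json.dumps(
        {"event_id": event_id, "occurred_at": now, "schema_version": 1, **order}
    )
    with conn:  # commits if the block succeeds, rolls back both inserts on any exception
        conn.execute(
            "INSERT INTO orders (order_id, customer_id, total_cents, currency) "
            "VALUES (?, ?, ?, ?)",
            (order["order_id"], order["customer_id"], order["total_cents"], order["currency"]),
        )
        conn.execute(
            "INSERT INTO outbox (event_id, topic, payload, created_at) VALUES (?, ?, ?, ?)",
            (event_id, "checkout.order.placed.v1", payload, now),
        )
    return event_id

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
event_id = place_order(
    conn,
    {"order_id": "ord_1001", "customer_id": "cus_42", "total_cents": 2999, "currency": "USD"},
)
```

The point of the sketch is that nothing is published here: the only side effect of place_order is two rows committed together, which is what makes the later relay safe to retry.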
3. Relay
Write the polling relay in 15-25 lines of pseudocode with:
- SELECT ... FOR UPDATE SKIP LOCKED LIMIT N
- publish to Kafka
- UPDATE ... SET published_at = now()
- sleep on empty result
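The four relay steps can be sketched as below. This is an illustration under stated assumptions, not a production relay: sqlite3 stands in for Postgres, so the row-locking claim query is shown only as a string (SQLite has no FOR UPDATE SKIP LOCKED), and publish is a hypothetical callable standing in for a Kafka producer.

```python
import sqlite3
import time

# Postgres form of the claim query, for reference only (not executed in this sketch).
PG_CLAIM_SQL = """
SELECT event_id, topic, payload FROM outbox
WHERE published_at IS NULL
ORDER BY created_at
LIMIT %s
FOR UPDATE SKIP LOCKED
"""

def relay_once(conn, publish, batch_size=100):
    """Publish one batch of unpublished rows and mark them; return the batch size."""
    with conn:  # one transaction per batch; an exception rolls back the marks
        rows = conn.execute(
            "SELECT event_id, topic, payload FROM outbox "
            "WHERE published_at IS NULL ORDER BY created_at LIMIT ?",
            (batch_size,),
        ).fetchall()
        for event_id, topic, payload in rows:
            # If publish raises, rows already sent in this batch are re-sent on the
            # next poll: delivery is at-least-once, so consumers must deduplicate.
            publish(topic, event_id, payload)
            conn.execute(
                "UPDATE outbox SET published_at = datetime('now') WHERE event_id = ?",
                (event_id,),
            )
    return len(rows)

def run_relay(conn, publish, batch_size=100, idle_sleep=0.5, max_idle_polls=None):
    """Poll until stopped; max_idle_polls exists only so demos and tests can terminate."""
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        if relay_once(conn, publish, batch_size) == 0:
            idle += 1
            time.sleep(idle_sleep)  # sleep on empty result
        else:
            idle = 0

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (event_id TEXT PRIMARY KEY, topic TEXT, payload TEXT, "
    "created_at TEXT, published_at TEXT)"
)
conn.executemany(
    "INSERT INTO outbox (event_id, topic, payload, created_at) VALUES (?, ?, ?, ?)",
    [("e1", "t", "{}", "2024-01-01T00:00:01"),
     ("e2", "t", "{}", "2024-01-01T00:00:02"),
     ("e3", "t", "{}", "2024-01-01T00:00:03")],
)
conn.commit()

sent = []
run_relay(conn, lambda topic, key, value: sent.append(key), idle_sleep=0, max_idle_polls=1)
print(sent)
```

Marking rows only after a successful publish is what turns "relay dies mid-batch" into a duplicate rather than a lost event.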
4. Consumer dedup
On a consumer of the resulting topic, write the idempotent handler: check processed_events, insert if new, apply effect in one transaction.
5. Failure modes
List four failure modes and what happens in each:
- relay dies after SELECT but before publish
- relay publishes but dies before UPDATE
- DB commits but process dies before return
- broker rejects the publish
Repeat until: You can produce the four artifacts in under 15 minutes. Run the kata twice: once with polling, once with Debezium/CDC tailing as the relay.
Kata 3: Choreography vs Orchestration for a Checkout
Time limit: 20 minutes
Goal: Model the same workflow twice, decide, and defend.
Setup: A marketplace checkout with 5 steps: ReserveStock, ChargeCard, CreateShipment, UpdateLoyaltyPoints, SendReceiptEmail.
Produce:
A. Choreography version
- draw (ASCII or mermaid) the event graph
- list every event name, producer, consumers
- identify where the workflow's state actually lives (trick: nowhere in particular)
- note two bugs that are easy to introduce ("ghost orders", "missed compensation")
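To make the "state lives nowhere" point concrete, here is a toy in-memory event bus, offered as a sketch only. The event names (StockReserved, CardCharged, and so on) and the subscription wiring are assumptions derived from the five steps; a real system would use a broker, and each handler would live in a different service.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # event name -> handlers
trace = []                       # order in which events were emitted

def on(event_name):
    """Decorator that subscribes a handler to an event name."""
    def register(fn):
        subscribers[event_name].append(fn)
        return fn
    return register

def emit(event_name, order_id):
    trace.append(event_name)
    for handler in subscribers[event_name]:
        handler(order_id)

@on("OrderPlaced")
def inventory_service(order_id):
    emit("StockReserved", order_id)

@on("StockReserved")
def payment_service(order_id):
    emit("CardCharged", order_id)

@on("CardCharged")
def shipping_service(order_id):
    emit("ShipmentCreated", order_id)

@on("CardCharged")
def loyalty_service(order_id):
    emit("LoyaltyPointsUpdated", order_id)

@on("ShipmentCreated")
def email_service(order_id):
    emit("ReceiptEmailSent", order_id)

emit("OrderPlaced", "ord_1001")
print(trace)
```

No function in this sketch knows the whole workflow: the sequence exists only as the sum of the subscriptions. Deleting one @on line silently truncates the flow, which is how ghost orders and missed compensations tend to appear.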
B. Orchestration version
- draw the state machine
- list commands the orchestrator sends and replies it expects
- identify retry and timeout policy per step
- note two operational dependencies (orchestrator uptime, schema versioning)
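The orchestration version can be sketched as a small in-process saga runner. This is a simplification under stated assumptions: steps are local callables instead of commands sent to other services, every exception is treated as retryable, timeouts are not modeled, and the compensation names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Step:
    name: str
    action: Callable[[dict], None]
    compensate: Optional[Callable[[dict], None]] = None
    max_attempts: int = 3  # per-step retry policy

def run_saga(steps, ctx):
    """Run steps in order; if a step exhausts its retries, undo completed steps in reverse."""
    completed = []
    for step in steps:
        for attempt in range(1, step.max_attempts + 1):
            try:
                step.action(ctx)
                completed.append(step)
                break
            except Exception:
                if attempt == step.max_attempts:
                    for done in reversed(completed):
                        if done.compensate:
                            done.compensate(ctx)
                    return f"FAILED_AT_{step.name}"
    return "COMPLETED"

log = []

def record(name):
    """Build an action that just records its name, standing in for a remote call."""
    return lambda ctx: log.append(name)

def declined(ctx):
    raise RuntimeError("card declined")

steps = [
    Step("ReserveStock", record("reserve_stock"), compensate=record("release_stock")),
    Step("ChargeCard", declined, compensate=record("refund_card"), max_attempts=2),
    Step("CreateShipment", record("create_shipment"), compensate=record("cancel_shipment")),
    Step("UpdateLoyaltyPoints", record("update_loyalty")),
    Step("SendReceiptEmail", record("send_receipt")),
]

result = run_saga(steps, {"order_id": "ord_1001"})
print(result, log)
```

Unlike the choreography version, the workflow's state has an obvious home here (the completed list and the current step), which is also why the orchestrator's own durability becomes an operational dependency.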
C. Decision
- Pick one.
- Write a 3-sentence defense.
- Name one stakeholder who would push back and the counter-argument.
Repeat until: You can produce both versions in under 15 minutes and your decision is consistent across runs.
Variants to try across repeats:
- same workflow, different team size (3 engineers vs 30)
- same workflow, different reliability needs (internal tool vs regulated payments)
- a totally different workflow: onboarding a SaaS tenant (5 steps, some taking hours)
Kata 4: Decide CQRS Yes/No for 3 Scenarios
Time limit: 15 minutes
Goal: Make three rapid CQRS decisions and defend each in 2-3 sentences.
For each scenario, decide: CQRS yes, no, or partial. Write the defense.
Scenario 4a -- Internal HR portal
5 engineers. 12 entities (employees, PTO, comp, etc.). 3 simple read screens, 1 reports page. Audit is nice-to-have.
Scenario 4b -- Trading platform
40 engineers on the platform team. Strong write-side invariants (no oversell, margin checks). Read sides: live position dashboard, real-time search, daily analytics pipeline, regulatory risk reports. Already have Kafka and an event store.
Scenario 4c -- Healthcare claims system
Regulatory audit required. Complex adjudication invariants. 3 read sides: provider dashboards, fraud detection, reimbursement analytics. Team is 15 engineers, with no prior event-sourcing experience.
Scenario 4d -- (bonus) Marketplace
Sellers upload listings (complex validation, ~hundreds of listings/sec). Buyers search (Elasticsearch), browse category pages (read-heavy), and a real-time "recently sold" feed. Team: 25 engineers, event-driven already in production.
Expected shape of answers:
- 4a: no -- CQRS is overkill. Use a DB and a reports view.
- 4b: yes (CQRS+ES) -- the classic fit.
- 4c: yes, but staged; recommend CQRS first with outbox + projections, event sourcing later if team grows into it.
- 4d: partial CQRS -- write side stays relational; projections for search and "recently sold." Event sourcing probably not worth it.
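For the partial-CQRS answer in 4d, a read-side projection can be as small as the sketch below. It assumes the projection consumes OrderPlaced-style events carrying event_id and line_items (field names are illustrative), and it keeps state in memory where a real read model would use a store such as Redis or a table.

```python
from collections import deque

class RecentlySoldProjection:
    """Read model for a 'recently sold' feed, built purely from events."""

    def __init__(self, size=3):
        self.feed = deque(maxlen=size)  # newest first; oldest entries fall off the end
        self.seen = set()               # processed event IDs (unbounded in this sketch)

    def apply(self, event):
        if event["event_id"] in self.seen:
            return  # duplicate delivery: ignore
        self.seen.add(event["event_id"])
        for item in event["line_items"]:
            self.feed.appendleft(item["sku"])

    def latest(self):
        return list(self.feed)

projection = RecentlySoldProjection(size=3)
e1 = {"event_id": "e1", "line_items": [{"sku": "A"}, {"sku": "B"}]}
e2 = {"event_id": "e2", "line_items": [{"sku": "C"}]}
e3 = {"event_id": "e3", "line_items": [{"sku": "D"}]}
for event in (e1, e2, e1, e3):  # e1 is delivered twice
    projection.apply(event)
print(projection.latest())
```

The write side never queries this structure and the projection never validates listings, which is the separation the "partial" answer relies on; the projection can be dropped and rebuilt by replaying events.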
Repeat until: You can make all three core calls (plus the 4d bonus) in under 10 minutes and can defend each without re-reading Concept 15.
Variants across repeats:
- swap the team sizes and see if your decision changes (it should)
- swap the read-count and see if your decision changes
- flip "audit required" on or off
Completion Standard
- You can complete Kata 1 in the time limit with an event that passes the Concept 01 checklist.
- You can produce a complete outbox sketch (DDL + txn + relay + dedup) in the time limit.
- You can draw both choreography and orchestration for the checkout workflow from memory.
- You can make and defend CQRS decisions for all four scenarios in the time limit.
- You have run each kata at least twice with different domains or constraints, and your answers were consistent in style.