Module 3: Event-Driven Architecture

Primary texts: Fundamentals of Software Architecture (Richards & Ford, Ch. 14) and Designing Event-Driven Systems (Stopford)

Selective support: System Design Primer (asynchronism, communication), Microservices Patterns (saga, outbox, CQRS), DDIA stream-processing chapters from Semester 6

This guide is the primary teacher. You do not need to read the source books front-to-back to complete this module. You do need to become fluent at reasoning about a system where events -- not requests -- are the unit of truth and coordination.

Scope of This Module

This module is not "how to run Kafka." It is where communication flips from synchronous requests to asynchronous facts, and where coordination stops being a remote procedure call and starts being a conversation between services about things that already happened.

What it covers in depth:

events as immutable facts about the past, distinguished from commands and queries
the mental shift from CRUD tables to event streams as the system of record
publish-subscribe vs point-to-point queues and when each makes sense
event notification vs event-carried state transfer and the consistency tradeoff
the outbox pattern and why database-plus-broker dual writes fail without it
queue-based brokers (JMS, AMQP, SQS) vs log-based brokers (Kafka, Pulsar)
consumer groups, partitions, ordering guarantees, and what "exactly once" really buys you
choreography vs orchestration for multi-service workflows
sagas with compensating transactions for long-running business processes
idempotency and deduplication as the consumer-side price of at-least-once delivery
event sourcing: the append-only log as the canonical record of state
projections and read models built by folding the log
CQRS and when separating the write and read sides actually helps

What it deliberately does not try to finish here:

operating a Kafka cluster in production (SRE-level tuning, multi-region replication)
every enterprise integration pattern (see Hohpe & Woolf for the full catalog)
stream-processing frameworks (Flink, Kafka Streams) at the implementation level
advanced event-modeling workshops (full domain storming)

This is a reasoning module, not a syntax module. If you can publish to Kafka but cannot explain why a command disguised as an event will rot your system, you are not done.

Before You Start

Answer these closed-book before starting the main path:

What is the difference between "the user clicked Buy" and "charge the user's card"?
Why does at-least-once delivery force consumers to be idempotent?
If two services need the same data, what are two fundamentally different ways to keep them in sync?
Why is writing to a database and then publishing to a broker in the same function a bug waiting to happen?
What does a saga give you that a 2PC transaction across services does not?

Diagnostic Interpretation

4-5 solid answers

You are ready for the full path.

2-3 solid answers

Continue, but expect extra time in Clusters 2 and 4 (outbox and saga).

0-1 solid answers

Revisit Semester 7 Module 2 (Cluster 3: event-driven topologies) and Semester 6 Module 3 (replication and log-based thinking) before continuing.

What This Module Is For

Event-driven architecture is the communication style that actually scales across team boundaries, not just machines. It shows up any time the real question is:

multiple systems care about the same thing happening -- who tells them?
a workflow crosses service boundaries and no one service owns it end-to-end
the write path and the read path have wildly different performance needs
you need an audit trail that is not a bolted-on log
your transactions are longer-running than any single request can hold open

This module builds the reasoning you need for:

designing microservice boundaries that do not degenerate into distributed monoliths
choosing a messaging substrate (queue vs log) for the real workload
writing services that survive duplicates, reordering, and replays
explaining eventual consistency to product managers without hand-waving
deciding when event sourcing and CQRS earn their complexity -- and when they do not

You are learning to design for facts, not for calls.

Concept Map

How To Use This Module

Work in order. Later clusters only make sense if the earlier vocabulary is stable.

Cluster 1: Events as a Mental Model

Order	Concept	Type	Focus
1	An Event Is an Immutable Fact About the Past	PRIMARY	The one-sentence definition, past tense, named by what happened
2	Events vs Commands vs Requests	PRIMARY	Three message intents and why mixing them rots systems
3	The Shift from CRUD to Events	PRIMARY	Why "last write wins" is a modeling choice, not a law

Cluster mastery check: Can you rename three "update" operations in a CRUD system as past-tense events without losing information?
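
As a minimal sketch of the renaming exercise, assume a hypothetical users table with an email column. The names `EmailAddressChanged` and `change_email` are invented for illustration. The point it tries to show is that the CRUD row keeps only the latest value, while the past-tense event keeps what changed, from what, and when.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class EmailAddressChanged:
    """Hypothetical event: past tense, named by what happened."""
    user_id: str
    old_email: str
    new_email: str
    occurred_at: datetime


def change_email(row: dict, new_email: str, now: datetime) -> EmailAddressChanged:
    """Apply the CRUD-style update and return the fact that describes it."""
    event = EmailAddressChanged(row["id"], row["email"], new_email, now)
    row["email"] = new_email  # the table keeps only the latest value
    return event


row = {"id": "u1", "email": "old@example.com"}
event = change_email(row, "new@example.com", datetime(2024, 1, 1, tzinfo=timezone.utc))
```

After the call, the row no longer knows the old address, but the event does. A command would read `ChangeEmailAddress` and could be rejected; the event describes something that already happened and cannot be.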

Cluster 2: Messaging Patterns

Order	Concept	Type	Focus
4	Publish-Subscribe vs Point-to-Point Queues	PRIMARY	Broadcast vs work distribution; one consumer vs many
5	Event Notification vs Event-Carried State Transfer	PRIMARY	Thin events that trigger lookups vs fat events that carry payload
6	The Outbox Pattern: Atomically Publishing Events	PRIMARY	Killing the dual-write bug with a transactional outbox

Cluster mastery check: Can you pick pub-sub vs queue for a given scenario and defend it, and can you explain why publishing inside a DB transaction is a different failure mode than publishing after commit?
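
The outbox idea can be sketched with the standard-library `sqlite3` module standing in for the service database. The table layout, the `place_order` and `relay_once` names, and the `publish` callback are assumptions made for this sketch, not a prescribed schema. The business row and the outbox row are written in one local transaction, and a separate polling relay publishes afterwards.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        event_id TEXT UNIQUE NOT NULL,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(order_id: str) -> None:
    # One local transaction covers both writes: they commit or roll back together.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (event_id, event_type, payload) VALUES (?, ?, ?)",
            (f"order-placed-{order_id}", "OrderPlaced", json.dumps({"order_id": order_id})),
        )


def relay_once(publish) -> int:
    """Polling relay: publish unpublished rows in order, then mark them."""
    rows = conn.execute(
        "SELECT seq, event_id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_id, event_type, payload in rows:
        publish(event_id, event_type, json.loads(payload))
        # A crash between publish and this update causes a re-publish,
        # so delivery is at-least-once and event_id serves as the dedup key.
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


sent = []
place_order("o1")
relay_once(lambda event_id, event_type, payload: sent.append((event_id, event_type, payload)))
```

Publishing directly after the commit can lose the event if the process dies in between. Here the event is already durable in the outbox table, so the worst case is a duplicate, which consumers handle with the event id.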

Cluster 3: Brokers and Log-Based Systems

Order	Concept	Type	Focus
7	Queue Semantics: JMS, AMQP, SQS	PRIMARY	Classical brokers, delivery acknowledgements, and DLQs
8	Log-Based Brokers: Kafka's Design and Retention	PRIMARY	Immutable partitioned log, offsets, retention as a design tool
9	Consumer Groups, Partitions, Ordering Guarantees	PRIMARY	Per-partition ordering, rebalancing, and key-based routing

Cluster mastery check: Can you draw a Kafka topic with three partitions and two consumers in a group, and explain exactly which consumer gets which messages and why?
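
The whiteboard exercise can be approximated in a few lines. This is a simplified model, not Kafka's implementation: `partition_for` uses CRC32 as a stand-in hash, and `assign_range` imitates a range-style assignment for a single topic. Both function names are hypothetical, and real assignments depend on the configured assignor.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; Kafka's default partitioner uses a different hash.
    # Same key -> same partition, which is what gives per-key ordering.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_range(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Range-style assignment: contiguous partitions, earlier members get the extras."""
    members = sorted(consumers)
    base, extra = divmod(num_partitions, len(members))
    assignment: dict[str, list[int]] = {}
    start = 0
    for i, member in enumerate(members):
        count = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


assignment = assign_range(3, ["consumer-a", "consumer-b"])
# assignment == {"consumer-a": [0, 1], "consumer-b": [2]}
```

Within a group, each partition has exactly one owner, so with three partitions and two consumers one member reads two partitions. Ordering holds only inside a partition, which is why the routing key matters.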

Cluster 4: Distributed Workflow with Events

Order	Concept	Type	Focus
10	Choreography vs Orchestration	PRIMARY	Who drives the workflow: each service, or a central coordinator?
11	Sagas: Long-Running Transactions Across Services	PRIMARY	Compensating transactions instead of distributed commit
12	Idempotency, Deduplication, and the Exactly-Once Illusion	PRIMARY	Why "at-least-once + idempotent" is the only honest story

Cluster mastery check: Can you walk a checkout saga through both a happy path and a failed-payment compensation, and can you explain why the inventory service must be idempotent even if the broker promises "exactly once"?
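
A failed-payment walk-through can be sketched as an orchestrated saga, assuming a central coordinator and in-process function calls in place of real service messages. The step names, `run_saga`, and the `stock` dictionary are invented for this example.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step], log: list[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name} failed")
            for done_name, _, compensate in reversed(done):
                compensate()
                log.append(f"{done_name} compensated")
            return False
        log.append(f"{name} done")
        done.append(step)
    return True


stock = {"sku-1": 5}


def reserve() -> None:
    stock["sku-1"] -= 1


def release() -> None:
    stock["sku-1"] += 1


def charge() -> None:
    raise RuntimeError("card declined")  # simulated payment failure


log: list[str] = []
ok = run_saga(
    [
        ("ReserveInventory", reserve, release),
        ("ChargePayment", charge, lambda: None),
        ("CreateShipment", lambda: None, lambda: None),
    ],
    log,
)
```

The sketch leaves out what makes real sagas hard: compensations can fail and need retries, and every step may be delivered more than once, so each action and compensation has to be idempotent.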

Cluster 5: Event Sourcing and CQRS

Order	Concept	Type	Focus
13	Event Sourcing: The Event Log Is the System of Record	PRIMARY	State as a fold over an append-only log
14	Projections and Read Models	PRIMARY	Views derived from the log, rebuildable on demand
15	CQRS: When to Separate Reads and Writes	SUPPORTING	Splitting the write model from the read model, and when not to

Cluster mastery check: Can you name three situations where event sourcing is the wrong answer, and one where CQRS without event sourcing is still a valid design?
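
"State as a fold over an append-only log" can be illustrated with a toy account ledger. The event types and the `apply_event` and `rebuild` names are hypothetical, and a Python list stands in for the event store.

```python
from functools import reduce

events = [
    {"type": "FundsDeposited", "account": "a1", "amount": 100},
    {"type": "FundsWithdrawn", "account": "a1", "amount": 30},
    {"type": "FundsDeposited", "account": "a2", "amount": 50},
]


def apply_event(balances: dict, event: dict) -> dict:
    """Pure step function: previous state + one event -> next state."""
    if event["type"] == "FundsDeposited":
        delta = event["amount"]
    elif event["type"] == "FundsWithdrawn":
        delta = -event["amount"]
    else:
        return balances  # unknown event types leave this projection unchanged
    account = event["account"]
    return {**balances, account: balances.get(account, 0) + delta}


def rebuild(log: list[dict]) -> dict:
    """Rebuild the balance read model from scratch by folding the whole log."""
    return reduce(apply_event, log, {})


balances = rebuild(events)
# balances == {"a1": 70, "a2": 50}
```

Because the read model is derived, it can be dropped and rebuilt at any time, and a second projection with a different shape can be folded from the same log without touching the write side.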

Then work these practice pages:

Order	Practice path	Focus
1	Event Modeling Lab	Naming events, avoiding disguised commands, storming a domain
2	Messaging Patterns Workshop	Pub-sub vs queue, notification vs state transfer, outbox sketch
3	Saga and Idempotency Clinic	Compensations, retries, dedup keys, ordering
4	Event-Driven Katas	Repeatable drills for event design, outbox, choreography, CQRS

Use the Module Quiz after the concept and practice paths. Use the Reference and Selective Reading page and the Learning Resources page only for targeted reinforcement.

Learning Objectives

By the end of this module you should be able to:

Write an event name and payload that state an immutable past fact, and resist the disguised-command trap.
Choose pub-sub vs point-to-point queueing for a concrete scenario and justify it in one paragraph.
Explain the dual-write bug and implement the outbox pattern to eliminate it.
Compare queue-based and log-based brokers by delivery, retention, and replay semantics.
Draw partitions and consumer groups on a whiteboard and predict who gets which messages.
Choose choreography or orchestration for a given workflow and defend the tradeoff.
Design a three-step saga with correct compensating transactions for the failure paths.
Make a consumer idempotent using a dedup key, and explain why "exactly once" is a systems-level property, not a broker feature.
Decide whether event sourcing, CQRS, or neither is justified for a given bounded context.
Rebuild a read model from an event log and explain why that rebuildability is the point.

Outputs

one event catalog with at least 20 events named in past tense, each with payload fields and a producing service
one architecture sketch comparing queue and log-based brokers for a shared scenario
one outbox implementation sketch (schema, polling query, idempotency key flow)
one full saga diagram with happy path, one failure path, and all compensations
one CQRS vs single-model memo defending a real decision
one idempotency design for a real consumer, including the dedup store and key
one mistake log naming at least 10 errors such as command-shaped event, published before commit, missing dedup, choreography without observability, premature event sourcing
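
For the idempotency design among the outputs above, one possible starting shape is a consumer that records each processed event id before treating a redelivery as a no-op. `InventoryConsumer` is a hypothetical name, and the in-memory set is a stand-in for a durable dedup store.

```python
class InventoryConsumer:
    """Toy consumer that applies each reservation at most once per event id."""

    def __init__(self, stock: int) -> None:
        self.stock = stock
        self.seen: set[str] = set()  # dedup store; in-memory for illustration only

    def handle(self, event_id: str, quantity: int) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event_id in self.seen:
            return False  # redelivery: state already reflects this event
        self.stock -= quantity
        self.seen.add(event_id)
        return True


consumer = InventoryConsumer(stock=10)
first = consumer.handle("order-placed-o1", 2)
second = consumer.handle("order-placed-o1", 2)  # simulated duplicate delivery
```

In a real service the state change and the dedup record would need to commit in the same transaction. Otherwise a crash between the two reintroduces the duplicate the design is meant to prevent.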

Completion Standard

You have completed Module 3 when all of these are true:

you can name an event in past tense without sneaking in a command
you can explain why publishing after a DB commit and publishing from an outbox are fundamentally different
you can pick queue vs log-based broker with a reason, not a brand preference
you can walk a saga through its happy path and at least one failure with correct compensations
you can tell the difference between event sourcing and "using events to communicate"
you can explain CQRS to someone who thinks it means "two databases"

If you can configure a broker but cannot say what an event is, the module is not complete.

Reading Policy

Concept pages are the main path.
Local book chunks are selective reinforcement, not a second syllabus.
"Read only if stuck" means try the concept page, self-check, and drill first.
"Optional deep dive" means additional nuance, not required progression.
External links to Fowler, microservices.io, and Kafka docs are used surgically where a local chunk is not enough.

Suggested Weekly Flow

Day	Work
1	Concepts 1-3 and the event-naming drill from Practice 1
2	Concepts 4-6 and the outbox sketch from Practice 2
3	Concepts 7-9 and the partition-and-consumer-group diagram kata
4	Concepts 10-12 and one full saga walkthrough from Practice 3
5	Concepts 13-15 and the CQRS-yes-or-no scenarios from Practice 4
6	Quiz, interleaved review, and mistake-log cleanup
7	Buffer and Feynman note on event-driven thinking

Reference

If you need exact links into the local chunked books, use Reference and Selective Reading.

Rich Learning Pages
