Event Sourcing: The Event Log Is the System of Record
What This Concept Is
Event sourcing is the design choice where the append-only log of events is the authoritative state of the system. There is no "current state" table that you UPDATE; whenever you need state, you fold the events.
current_state = fold(empty_state, events_for_aggregate)
Contrast with the CRUD shift (Concept 03): there, events were additionally published from a service that still kept a current-state DB. In event sourcing, the DB, if any, is derived. The log comes first, and every view is a projection (Concept 14).
Three properties define an event-sourced aggregate:
- State is derived from events. To load an order, read every event for that order ID, in order, and fold them into the state.
- Writes are appends to the event log. You never mutate; you append a new event describing the decision.
- The log is immutable and complete. You can rebuild any past state by folding up to a chosen point in time.
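The three properties above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the `Event` and `OrderState` classes, the `fold` helper, and the sample payload values are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from functools import reduce

@dataclass(frozen=True)
class Event:
    type: str   # e.g. "OrderPlaced"
    data: dict  # event payload

@dataclass(frozen=True)
class OrderState:
    status: str = "pending"
    total_cents: int = 0
    version: int = 0  # number of events folded so far

def apply(state: OrderState, event: Event) -> OrderState:
    """Pure transition: (state, event) -> new state. Never mutates."""
    if event.type == "OrderPlaced":
        state = replace(state, status="placed", total_cents=event.data["total_cents"])
    elif event.type == "PaymentCaptured":
        state = replace(state, status="paid")
    # Unknown event types leave the fields unchanged but still advance the version.
    return replace(state, version=state.version + 1)

def fold(events: list[Event]) -> OrderState:
    # State is derived: start from the empty state and apply each event in order.
    return reduce(apply, events, OrderState())

# The log is only ever appended to; values here are made up for illustration.
log = [
    Event("OrderPlaced", {"total_cents": 4200}),
    Event("PaymentCaptured", {"txn_id": "txn_1", "amount_cents": 4200}),
]
current = fold(log)
earlier = fold(log[:1])  # any past state is a fold over a prefix of the log
```

Because `apply` is pure and the state is frozen, folding the same events always yields the same state, which is what makes rebuilding safe.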
Why It Matters Here
Event sourcing is the strongest form of the "events as truth" mental model. It gives you:
- Complete audit trail by construction. Every state change is a fact with a timestamp and cause; there is nothing to bolt on.
- Time travel. State at any past moment is a fold up to that moment.
- Rebuildable read models. Drop any projection (Concept 14), re-fold the log, done. This is the superpower that makes CQRS (Concept 15) practical.
- Natural integration. Other services can subscribe to the log instead of polling your tables.
It also imposes real costs, which is why it is a specialized tool.
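As a rough sketch of the time-travel property listed above: if every event carries the time it was recorded, "state at moment T" is a fold that stops at T. The timestamps, the `STATUS_AFTER` table, and the `status_as_of` function are hypothetical names chosen for this example.

```python
from datetime import datetime, timezone

# Hypothetical log entries: (recorded_at, event_type). Timestamps are made up.
log = [
    (datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc), "OrderPlaced"),
    (datetime(2024, 3, 1, 9, 5, tzinfo=timezone.utc), "StockReserved"),
    (datetime(2024, 3, 2, 14, 0, tzinfo=timezone.utc), "PaymentCaptured"),
    (datetime(2024, 3, 4, 8, 30, tzinfo=timezone.utc), "OrderShipped"),
]

STATUS_AFTER = {
    "OrderPlaced": "placed",
    "StockReserved": "reserved",
    "PaymentCaptured": "paid",
    "OrderShipped": "shipped",
}

def status_as_of(log, moment):
    """Fold only the events recorded at or before `moment`."""
    status = "pending"
    for recorded_at, event_type in log:
        if recorded_at > moment:
            break  # the log is ordered, so every remaining event is later
        status = STATUS_AFTER.get(event_type, status)
    return status

on_march_3 = status_as_of(log, datetime(2024, 3, 3, tzinfo=timezone.utc))
```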
Concrete Example: An Order Aggregate
The event log for one order
events for order ord_9f2a:
    evt_01  OrderPlaced      {customer_id, items, total_cents}
    evt_02  StockReserved    {reservation_id}
    evt_03  PaymentCaptured  {txn_id, amount_cents}
    evt_04  OrderShipped     {carrier, tracking}
Loading the order's current state
def load_order(order_id):
    events = event_store.read_stream(order_id)  # ordered list, oldest first
    state = Order(status='pending', items=[])   # empty state before any event
    for e in events:
        state = apply(state, e)                 # fold one event into the state
    return state

def apply(state, e):
    # Pure transition: returns a new state, never mutates the old one.
    match e.type:
        case "OrderPlaced":
            return state.with_items(e.items, status='placed')
        case "StockReserved":
            return state.with_status('reserved')
        case "PaymentCaptured":
            return state.with_status('paid')
        case "OrderShipped":
            return state.with_status('shipped')
        case _:
            return state  # unknown event types leave the state unchanged
Making a decision
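def handle_ship_command(order_id):
    state = load_order(order_id)                # rebuild state from the log
    if state.status != 'paid':                  # check the invariant first
        raise InvalidState("cannot ship before paid")
    event = OrderShipped(order_id, carrier=..., tracking=...)
    # Append only if nobody else has written to the stream since we loaded it.
    event_store.append(order_id, event, expected_version=state.version)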
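The expected_version check gives optimistic concurrency: the append succeeds only if the stream is still at the version the decision was based on (state.version is assumed here to be the number of events folded during load). Two concurrent decisions cannot both append; one wins, and the other is rejected, reloads the stream, and retries against the new state.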
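The conflict described above can be reproduced with a toy store. This is a sketch under assumptions: `InMemoryEventStore` and `ConcurrencyError` are illustrative names, `OrderCancelled` is a hypothetical event that does not appear in the order log above, and a real store would perform the check and the append atomically rather than in plain Python.

```python
class ConcurrencyError(Exception):
    """Raised when the stream moved on since the caller loaded it."""

class InMemoryEventStore:
    def __init__(self):
        self._streams = {}  # stream_id -> list of events

    def read_stream(self, stream_id):
        return list(self._streams.get(stream_id, []))

    def append(self, stream_id, event, expected_version):
        stream = self._streams.setdefault(stream_id, [])
        # The version is the number of events already in the stream.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {len(stream)}"
            )
        stream.append(event)

store = InMemoryEventStore()
store.append("ord_9f2a", "OrderPlaced", expected_version=0)

# Two handlers load the same stream and both see version 1.
seen_by_a = len(store.read_stream("ord_9f2a"))
seen_by_b = len(store.read_stream("ord_9f2a"))

store.append("ord_9f2a", "StockReserved", expected_version=seen_by_a)  # wins
try:
    store.append("ord_9f2a", "OrderCancelled", expected_version=seen_by_b)
    conflict = False
except ConcurrencyError:
    conflict = True  # the loser must reload, re-check its invariant, and retry
```

The retry matters because the invariant may no longer hold: the second handler decided against a state that is now out of date.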
Snapshots
For aggregates with long histories (years, thousands of events), folding every event on every load is expensive. A snapshot is the materialized state at version N; on load, read the snapshot, then fold only the events after N. Snapshots are a performance optimization, not a source of truth.
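A minimal sketch of snapshot-assisted loading, assuming a toy aggregate whose events are signed integer deltas and whose state is their running sum. The `load` function and the `(state, version)` snapshot shape are assumptions made for the example.

```python
def apply(balance, event):
    """Toy fold step: each event is a signed integer delta."""
    return balance + event

def load(events, snapshot=None):
    """Return (state, version), starting from a snapshot when one exists.

    `snapshot` is a (state, version) pair, where version is the number of
    events already folded into that state.
    """
    state, version = snapshot if snapshot is not None else (0, 0)
    for event in events[version:]:  # only the tail after the snapshot
        state = apply(state, event)
        version += 1
    return state, version

events = [10, -3, 5, 7, -2]
snapshot = load(events[:3])             # taken when the stream had 3 events
from_snapshot = load(events, snapshot)  # folds only the last 2 events
from_scratch = load(events)             # folds all 5 events
```

Both paths must agree. If they ever differ, the snapshot is wrong and can be thrown away, because the events remain the source of truth.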
The Pieces of an Event-Sourced System
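The event store is the specialized persistence layer. It can be a dedicated database (EventStoreDB, since renamed KurrentDB; Marten on Postgres; Axon) or a Kafka topic with infinite retention. The Kafka option needs care with stream isolation: Kafka has no built-in per-aggregate stream or expected-version check on append, so both have to be designed in.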
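To make the idea of an event store concrete, here is a hedged sketch of an append-only table in SQLite (Python standard library). The table layout and the `append` and `read_stream` helpers are assumptions for illustration and do not mirror the API of any product named above. The primary key on (stream_id, version) rejects a second writer claiming the same version; it does not stop a caller from skipping ahead, which a fuller implementation would also check.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        stream_id TEXT    NOT NULL,
        version   INTEGER NOT NULL,   -- 1-based position within the stream
        type      TEXT    NOT NULL,
        payload   TEXT    NOT NULL,   -- JSON
        PRIMARY KEY (stream_id, version)
    )
    """
)

def append(stream_id, event_type, payload, expected_version):
    """Append one event; return False if another writer got there first."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO events VALUES (?, ?, ?, ?)",
                (stream_id, expected_version + 1, event_type, json.dumps(payload)),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # (stream_id, version) is already taken

def read_stream(stream_id):
    rows = conn.execute(
        "SELECT type, payload FROM events WHERE stream_id = ? ORDER BY version",
        (stream_id,),
    )
    return [(event_type, json.loads(payload)) for event_type, payload in rows]

first = append("ord_9f2a", "OrderPlaced", {"total_cents": 4200}, expected_version=0)
second = append("ord_9f2a", "StockReserved", {"reservation_id": "res_1"}, expected_version=1)
stale = append("ord_9f2a", "OrderShipped", {}, expected_version=1)  # lost the race
```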
When Event Sourcing Is the Right Answer
- Audit is a hard requirement, not a nice-to-have (regulated finance, healthcare, legal, insurance claims).
- Business analysts ask "what did we know on X date?" -- and they ask it often.
- Decisions depend on history (fraud, loyalty accrual, fee calculations based on the sequence of events).
- You need to rebuild or refactor views without touching source data.
- Multiple read models diverge in shape (CQRS, Concept 15).
- Temporal behavior matters: you care about ordering, when something became true, and versioning of historical decisions.
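The points above about rebuilding views and diverging read models can be illustrated with two projections over one log. This is a sketch with made-up data; the function names and the payload values are assumptions, and a production projection would usually update incrementally rather than rescan the log each time.

```python
from collections import Counter

# Hypothetical log spanning several orders: (order_id, event_type, payload).
log = [
    ("ord_1", "OrderPlaced", {"customer_id": "c_1", "total_cents": 4200}),
    ("ord_2", "OrderPlaced", {"customer_id": "c_2", "total_cents": 1500}),
    ("ord_1", "PaymentCaptured", {"amount_cents": 4200}),
    ("ord_1", "OrderShipped", {"carrier": "demo"}),
]

def project_latest_event_counts(log):
    """Read model 1: how many orders currently sit at each latest event type."""
    latest_by_order = {}
    for order_id, event_type, _ in log:
        latest_by_order[order_id] = event_type
    return Counter(latest_by_order.values())

def project_revenue_by_customer(log):
    """Read model 2: captured revenue per customer, in cents."""
    customer_of = {}
    revenue = Counter()
    for order_id, event_type, payload in log:
        if event_type == "OrderPlaced":
            customer_of[order_id] = payload["customer_id"]
        elif event_type == "PaymentCaptured":
            revenue[customer_of[order_id]] += payload["amount_cents"]
    return revenue

# Rebuilding a view is just rerunning its projection over the unchanged log.
latest_counts = project_latest_event_counts(log)
revenue = project_revenue_by_customer(log)
```

Neither projection touches the log, so a new read model can be added later and backfilled from the first event onward.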
When It Is the Wrong Answer
- Simple CRUD with no historical interest. The value does not justify the cost.
- The team has no experience with it. Event sourcing is subtle: schema evolution, upcasters, dealing with "we got the event model wrong" three years in.
- Ad-hoc querying is the main access pattern. Projections (Concept 14) can answer expected queries; unexpected ones require rebuilding a view or scanning the log.
- You need immediate consistency across aggregates. Event sourcing gives strong consistency only within one aggregate stream; cross-aggregate consistency is handled via sagas, which adds complexity.
Common Confusion / Misconception
"Event sourcing is just logging." A log of "user clicked X" is not event sourcing. Event sourcing requires the events to be the authoritative domain facts from which state is derived, not a sidecar observation stream.
"I can always add event sourcing later." Retrofitting event sourcing into an existing CRUD system is massive. The commands, invariants, and schema must be rethought. It is usually easier to start greenfield or accept "events published from CRUD" (Concept 03) as your ceiling.
"Snapshots remove the need for the events." Snapshots are derived; you must be able to discard them and rebuild from events. If the snapshot is authoritative, you are no longer event-sourced.
"Schema evolution is easy -- just version events." It is manageable but not easy. Old events live forever; you need upcasters (functions that transform old-versioned events into new ones at read time). Budget for this.
"Event sourcing gives me CQRS for free." It enables CQRS naturally, but CQRS and event sourcing are independent choices (see Concept 15).
How To Use It
If you decide to use event sourcing in a bounded context:
- Model commands, events, and invariants explicitly. Events are domain facts; commands are requests to change state.
- Pick the aggregate boundary. An aggregate is a consistency boundary: one transaction appends events to one aggregate stream.
- Choose the event store. EventStoreDB, Marten, Axon, Kafka, or a DB + outbox as a low-cost start.
- Design for schema evolution from day 1: version field, upcasters, codec registry.
- Plan snapshots if aggregates can grow past ~1000 events.
- Publish events to the broker so consumers (other services, projections) can subscribe.
- Keep it local. Event sourcing is a per-bounded-context decision, not a system-wide one.
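The first step in the list above, separating commands from events and making invariants explicit, might look like the following. It is a sketch only: the `ShipOrder` command, the `decide` function, and the carrier value are names introduced for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ShipOrder:     # command: a request, which may be rejected
    order_id: str
    carrier: str

@dataclass(frozen=True)
class OrderShipped:  # event: a fact in the past tense, recorded once accepted
    order_id: str
    carrier: str

class InvalidState(Exception):
    pass

def decide(status: str, command: ShipOrder) -> list:
    """Check the invariant against loaded state; return the events to append."""
    if status != "paid":
        raise InvalidState("cannot ship before paid")
    return [OrderShipped(command.order_id, command.carrier)]

events = decide("paid", ShipOrder("ord_9f2a", "demo-carrier"))
try:
    decide("placed", ShipOrder("ord_9f2a", "demo-carrier"))
    rejected = False
except InvalidState:
    rejected = True  # a rejected command appends nothing to the log
```

Keeping `decide` free of I/O means the invariants can be unit-tested without an event store.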
Check Yourself
- Why is the event log the source of truth, and what role does the current-state DB (if any) play?
- Why is schema evolution non-trivial in an event-sourced system?
- What does optimistic concurrency do when appending to a stream, and why is it necessary?
- Name three situations where event sourcing is overkill.
Mini Drill or Application
Pick one aggregate in a system you know (an order, a support ticket, a loan application). In 25 minutes:
- List the events that tell its story, past tense.
- Write the fold (pseudocode) that produces the current state.
- Write two commands and the validation they perform against loaded state.
- Identify one field whose history matters and explain why a CRUD table would have lost it.
Transfer to Adjacent Domains
- Projections & CQRS (Concepts 14, 15). Event sourcing is natural upstream of CQRS: the log is the write store, projections are the read models, and "rebuild from scratch" is a routine operation rather than a migration event. ES without CQRS is rare; CQRS without ES is common.
- DDD aggregates (S7M3). An event-sourced aggregate is the strong form of "aggregate as consistency boundary." Commands load the aggregate by folding its events, decide, and append new events. DDD and ES were invented independently, yet they share the same structural shape.
- Regulatory / audit domains. Event sourcing is a natural fit for regulated domains that require full state-of-record history (banking, claims, medical, policy admin). The audit trail is the artifact, not a bolt-on.
- Data science & counterfactuals. A log of facts supports "what if we had priced X differently on date Y" analyses that a CRUD database cannot answer once it has overwritten the historical values. ML feature stores are increasingly designed against event logs for exactly this reason.
- Schema evolution craft. Long-lived event logs force teams to treat event schemas as forever-public APIs. The "upcaster" tradition (functions that transform vN events into vN+1 at read time) is a specific skill; estimate it into the adoption cost.
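A small sketch of the upcaster idea: stored events are never rewritten, and old versions are lifted to the current shape at read time. The v1-to-v2 change (a `total` field in currency units becoming `total_cents`), the registry layout, and the dict-based event shape are all hypothetical.

```python
def upcast_v1_to_v2(event):
    """Hypothetical change: v1 stored `total`, v2 stores integer `total_cents`."""
    data = dict(event["data"])  # copy so the stored event is not modified
    data["total_cents"] = round(data.pop("total") * 100)
    return {**event, "version": 2, "data": data}

# Registry keyed by (event type, from-version). Each entry lifts one version.
UPCASTERS = {
    ("OrderPlaced", 1): upcast_v1_to_v2,
}

def upcast(event):
    """Apply upcasters until none is registered for the event's version."""
    while (event["type"], event["version"]) in UPCASTERS:
        event = UPCASTERS[(event["type"], event["version"])](event)
    return event

stored = {"type": "OrderPlaced", "version": 1, "data": {"total": 42.5}}
current = upcast(stored)
```

Chaining single-step upcasters keeps each one small: a later v3 needs only a v2-to-v3 function, and v1 events pass through both.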
Read This Only If Stuck
- Richards & Ford: Preventing Data Loss -- durability considerations that apply to event stores
- Richards & Ford: Event-Driven Architecture Style -- the broker/event substrate on which event sourcing sits
- Richards & Ford: Mediator Topology -- orchestrated command handling that often fronts event-sourced aggregates
- System Design Primer: Consistency patterns -- eventual-consistency setting that ES projections sit inside
- System Design Primer: Asynchronism -- the async substrate that carries appended events to projections and other services
- Martin Fowler: Event Sourcing -- canonical essay with diagrams and tradeoffs
- Martin Fowler: Event-Driven State -- the "events-as-state" lens of the four-lens essay
- Microservices.io: Event sourcing -- microservices-oriented pattern treatment
- Greg Young: CQRS Documents (PDF) -- original long-form write-up; sections on event storage and business value are evergreen
- Greg Young: CQRS and Event Sourcing (talk) -- the talk that defined the modern practice
- Confluent: It's Okay To Store Data In Kafka -- case for the log as a durable source of record
- EventStoreDB: What is event sourcing? -- concrete implementation framing from a purpose-built event store vendor
- Alberto Brandolini: EventStorming -- discovery workshop format that produces the event list an ES design starts from