
Choreography vs Orchestration

What This Concept Is

When a business process crosses multiple services, someone has to decide the order of steps, handle retries, and react to failures. There are two fundamental answers:

Choreography (broker topology)

No central coordinator. Each service subscribes to the events it cares about, does its part, and publishes further events. The workflow is implicit in the web of subscriptions.

       Customer clicks Buy
               |
               v
          [checkout]
               |
               v  OrderPlaced
           (broker)
          /     \      \
         v       v      v
   [billing] [inventory] [notify]
       |          |
       v          v  StockReserved
 PaymentCaptured  (broker)
    (broker)         |
        \            v
         \      [fulfillment]
          \          |
           \         v
            -- OrderShipped --

No single box "owns" the workflow. Each service knows what events it consumes and what events it emits.
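The topology above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: the `Broker` class is hypothetical and stands in for a real broker, and the rule that fulfillment ships only after it has seen both PaymentCaptured and StockReserved is an assumption about how the two branches join.

```python
from collections import defaultdict

class Broker:
    """Toy synchronous pub/sub broker (hypothetical stand-in for a real one)."""
    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # every published event, in publication order

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, order_id):
        self.log.append((event_type, order_id))
        for handler in list(self.subscribers[event_type]):
            handler(order_id)

broker = Broker()
seen = defaultdict(set)  # fulfillment's private state: order_id -> events received

def billing(order_id):
    broker.publish("PaymentCaptured", order_id)

def inventory(order_id):
    broker.publish("StockReserved", order_id)

def notify(order_id):
    pass  # would send a confirmation message; emits nothing

def fulfillment_on(event_type):
    def handler(order_id):
        seen[order_id].add(event_type)
        # Assumption: ship only once both upstream facts are known.
        if seen[order_id] == {"PaymentCaptured", "StockReserved"}:
            broker.publish("OrderShipped", order_id)
    return handler

# The "workflow" exists only as this web of subscriptions.
broker.subscribe("OrderPlaced", billing)
broker.subscribe("OrderPlaced", inventory)
broker.subscribe("OrderPlaced", notify)
for evt in ("PaymentCaptured", "StockReserved"):
    broker.subscribe(evt, fulfillment_on(evt))

broker.publish("OrderPlaced", "order-1")
event_types = [event_type for event_type, _ in broker.log]
```

Note that no function in the sketch describes the end-to-end order of steps; you can only recover it by reading the `subscribe` calls together.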

Orchestration (mediator topology)

A central orchestrator (AWS Step Functions, Temporal, Camunda, Netflix Conductor, or a hand-rolled saga coordinator) drives the workflow. It sends commands to services and waits for replies.

         +----------------------+
         |     Orchestrator     |
         |  (Temporal workflow) |
         +--+-----+-----+-----+-+
            |     |     |     |
            v     v     v     v
         [bill] [inv] [ship] [notify]

The orchestrator knows the state machine and keeps track of where the workflow is.
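As a rough illustration, here is a hand-rolled orchestrator in plain Python. It does not use the Temporal or Step Functions APIs; the service functions (`bill`, `reserve_inventory`, `ship`, `notify`) and the `"ok"` reply convention are hypothetical placeholders for real service calls.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowRun:
    order_id: str
    history: list = field(default_factory=list)  # (step name, reply) pairs
    status: str = "RUNNING"

# Hypothetical service clients: each accepts a command and returns a reply.
def bill(order_id):
    return "ok"

def reserve_inventory(order_id):
    return "ok"

def ship(order_id):
    return "ok"

def notify(order_id):
    return "ok"

# The state machine lives in one place: an ordered list of steps.
STEPS = [("bill", bill), ("inv", reserve_inventory), ("ship", ship), ("notify", notify)]

def run_workflow(order_id, steps=STEPS):
    run = WorkflowRun(order_id)
    for name, command in steps:
        reply = command(order_id)          # send the command, wait for the reply
        run.history.append((name, reply))  # record where the workflow is
        if reply != "ok":
            run.status = f"FAILED_AT_{name}"
            return run
    run.status = "COMPLETED"
    return run

run = run_workflow("order-1")

# A failing step stops the run and leaves a readable trail.
failing_steps = [("bill", bill), ("inv", reserve_inventory), ("ship", lambda _id: "error")]
failed = run_workflow("order-2", failing_steps)
```

A real orchestrator would also persist `history` durably so a run survives a process restart; this sketch keeps it in memory only.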

Why It Matters Here

These are the two ways to combine events, sagas, and multi-service processes. Every multi-service workflow you design will be one, the other, or (often) a hybrid. The cost of picking wrong is paid slowly, in debugging and deploys.

The Tradeoff Table

| Axis | Choreography | Orchestration |
| --- | --- | --- |
| Coupling | Each service coupled to events it emits/consumes | Each service coupled to the orchestrator's commands |
| Visibility | Poor -- workflow exists only as a graph of subscriptions | Good -- one place shows the state machine |
| Adding a new step | Add a subscriber to existing events | Change the orchestrator definition |
| Adding a new consumer who only observes | Trivial (subscribe) | Irrelevant to orchestrator |
| Failure handling | Each service handles its own; correlation is hard | Orchestrator sees failures and runs compensations |
| Retries | Per service; each invents its own policy | Centralized, declarative |
| Timeouts / long delays | Awkward; requires scheduling events | First-class (Temporal timers, Step Functions waits) |
| Debugging | "Why didn't X happen?" requires correlating logs | One execution-history view per run |
| Operational dependency | Broker only | Broker + orchestrator (SPOF-ish) |
| Best fit | Reactive integration, fan-out, small workflows | Business-critical workflows, long durations, explicit state |
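To make the "Retries" row concrete, the sketch below shows what "centralized, declarative" can look like: the policy is data, and a single helper applies it to any step. `RetryPolicy` and `call_with_retry` are hypothetical names, not the API of Temporal or Step Functions, and the default no-op `sleep` is an assumption so the example runs instantly (pass `time.sleep` for real delays).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 3
    initial_delay_s: float = 1.0
    backoff: float = 2.0  # multiplier applied to the delay after each failure

def call_with_retry(fn, policy, sleep=lambda seconds: None):
    """Call fn until it succeeds or the policy is exhausted.

    Returns (result, attempts_used). Re-raises the last error on exhaustion.
    """
    delay = policy.initial_delay_s
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return fn(), attempt
        except Exception:
            if attempt == policy.max_attempts:
                raise
            sleep(delay)
            delay *= policy.backoff

# Usage: a step that fails twice with a transient error, then succeeds.
attempts = {"n": 0}

def flaky_step():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

delays = []  # record requested delays instead of actually sleeping
result, used = call_with_retry(flaky_step, RetryPolicy(), sleep=delays.append)
```

In choreography, each subscriber would carry its own copy of logic like this, usually with slightly different defaults; in orchestration, the policy sits next to the step definition.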

Concrete Examples

Choreography is right: domain-event fan-out

OrderPlaced is published. billing, inventory, notifications, and analytics each consume it independently. There is no "workflow"; there are four independent reactions. Putting an orchestrator in the middle would add cost for no coordination benefit.

Orchestration is right: onboarding a new customer

Steps: create CRM record -> provision SaaS tenant -> send welcome email -> schedule kickoff call -> report to analytics. Some steps take days (wait for kickoff call to complete). Failures require specific compensations (cancel provisioning). The workflow is a first-class thing that business people care about; they need to see its state.

Implementation with Temporal or Step Functions makes the state machine explicit, lets support reps see "stuck at step 3 waiting on provisioning," and handles retries, timers, and compensations declaratively.
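The compensation and "stuck at step N" ideas can be sketched without any orchestrator product. The code below is a simplified illustration, not Temporal or Step Functions code: `run_onboarding` is a hypothetical helper, the step names follow the list above, and compensating the CRM record is an assumption (the text only names cancelling provisioning). Multi-day waits are omitted; a real orchestrator would persist state and use durable timers for those.

```python
def run_onboarding(steps):
    """steps: list of (name, action, compensation_or_None).

    Returns (final_state, trail), where trail is the human-readable
    history a support rep would look at.
    """
    trail = []
    done = []  # completed steps whose compensation is owed if a later step fails
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            trail.append(f"{name}: failed ({exc})")
            # Undo completed steps in reverse order.
            for prev_name, undo in reversed(done):
                undo()
                trail.append(f"{prev_name}: compensated")
            return "COMPENSATED", trail
        trail.append(f"{name}: done")
        if compensate is not None:
            done.append((name, compensate))
    return "COMPLETED", trail

# Usage: the welcome email fails, so earlier steps are rolled back.
effects = []  # stands in for side effects on real systems

def fail_email():
    raise RuntimeError("mail provider down")

steps = [
    ("create_crm_record", lambda: effects.append("crm+"), lambda: effects.append("crm-")),
    ("provision_tenant", lambda: effects.append("tenant+"), lambda: effects.append("tenant-")),
    ("send_welcome_email", fail_email, None),
]
state, trail = run_onboarding(steps)
```

The `trail` list is the in-memory analogue of an orchestrator's execution history: one place that says which step failed and what was undone.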

Hybrid is very common

Choreography at the edges, orchestration at the critical core. Orchestrators publish events when major stages complete, so observers can subscribe without becoming participants.
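One way to picture the hybrid, as a sketch under stated assumptions: the orchestrator runs the critical steps in order and publishes a completion event after each stage, while an observer (here, analytics) subscribes without the orchestrator knowing it exists. The `subscribe`/`publish` helpers and the `<Stage>Completed` event naming are hypothetical.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # toy in-memory broker

def subscribe(event_type, handler):
    subscribers[event_type].append(handler)

def publish(event_type, payload):
    for handler in subscribers[event_type]:
        handler(payload)

def orchestrate_order(order_id, steps):
    """Orchestrated core: run each stage in order, then announce it as a fact."""
    for stage, command in steps:
        command(order_id)                                      # coordination
        publish(f"{stage}Completed", {"order_id": order_id})   # fact for observers

# Edge observer: subscribes to facts, never receives commands.
analytics_seen = []
subscribe("BillingCompleted", lambda p: analytics_seen.append(("Billing", p["order_id"])))
subscribe("ShippingCompleted", lambda p: analytics_seen.append(("Shipping", p["order_id"])))

calls = []  # stands in for the real billing and shipping services
orchestrate_order("order-7", [("Billing", calls.append), ("Shipping", calls.append)])
```

Removing the analytics subscriptions would not change the orchestrated path at all, which is the point of keeping observers outside the workflow definition.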

Common Confusion / Misconception

"Choreography is more loosely coupled." It trades explicit coupling (orchestrator knows services) for implicit coupling (services know each other through events). Implicit coupling is not less -- it is less visible. Hidden coupling is often worse than explicit coupling because nobody maintains it.

"Temporal / Step Functions removes the need for events." No. Orchestrated workflows still emit events (completion events, progress events) for observability and reactive integration. The orchestrator handles coordination; events still carry facts.

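"Choreography scales better." Sometimes. A broker fan-out scales easily in throughput, but as workflows get more complex, choreography's debugging cost tends to grow faster than linearly with the number of participants (any service may be reacting to any other's events), while orchestration's tends to grow roughly linearly (one more step in one definition).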

"An orchestrator is a single point of failure." Modern orchestrators (Temporal, Step Functions) are themselves distributed and durable. The orchestrator's state is not fragile; what is fragile is the coupling of workflows to a specific orchestrator product, which is real but manageable.

"We can start with choreography and switch to orchestration later if we need to." Sometimes, but the switch is nontrivial: event schemas become orchestrator commands, subscribers become step handlers, and your whole "who owns the workflow" story inverts.

How To Use It

Decision guide:

If you choose choreography:

  1. Publish rich correlation IDs on every event and stitch traces.
  2. Invest in a "workflow viewer" dashboard that reads the event stream and shows "orders stuck at step X."
  3. Write down the implicit state machine in a document; do not leave it as tribal knowledge.
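The first two items can be prototyped together. The sketch below assumes every event carries a `correlation_id` and a timestamp, and that `OrderShipped` is the terminal event; the `Event` type, field names, and the `stuck_workflows` helper are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    correlation_id: str  # same ID on every event of one workflow instance
    type: str
    ts: float            # seconds since some epoch

TERMINAL = "OrderShipped"  # assumption: this event ends the workflow

def stuck_workflows(events, now, max_age_s):
    """Return {correlation_id: last_event_type} for runs that have gone quiet."""
    latest = {}
    for ev in sorted(events, key=lambda e: e.ts):
        latest[ev.correlation_id] = ev  # later events overwrite earlier ones
    return {
        cid: ev.type
        for cid, ev in latest.items()
        if ev.type != TERMINAL and now - ev.ts > max_age_s
    }

events = [
    Event("order-1", "OrderPlaced", 0),
    Event("order-1", "PaymentCaptured", 5),
    Event("order-1", "OrderShipped", 60),
    Event("order-2", "OrderPlaced", 10),
    Event("order-2", "StockReserved", 12),
    Event("order-3", "OrderPlaced", 990),
]
stuck = stuck_workflows(events, now=1000, max_age_s=300)
```

Here `order-1` finished, `order-3` is recent, and only `order-2` is reported as stuck after `StockReserved`. Without the correlation ID there would be nothing to group the events by.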

If you choose orchestration:

  1. Pick a substrate (Temporal, Step Functions, Camunda 8, Cadence) and commit.
  2. Keep service APIs idempotent; the orchestrator will retry.
  3. Emit events on stage completions so observers can subscribe without participating.
  4. Version the workflow definition; long-running instances outlive deploys.
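Item 2 is the one most often skipped, so here is a minimal sketch of an idempotent handler. The `BillingService` class and the idempotency-key format are hypothetical; a real service would store keys durably and atomically with the charge rather than in an in-memory dict.

```python
class BillingService:
    """Toy service whose capture operation is safe to retry."""
    def __init__(self):
        self.charges = []    # side effects that must not be duplicated
        self._results = {}   # idempotency_key -> reply already returned

    def capture_payment(self, idempotency_key, order_id, amount_cents):
        # A retried command with a known key returns the stored reply
        # instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((order_id, amount_cents))
        reply = {"order_id": order_id, "charge_no": len(self.charges)}
        self._results[idempotency_key] = reply
        return reply

svc = BillingService()
key = "wf-42/step-bill"  # assumption: orchestrator derives the key from run + step
first = svc.capture_payment(key, "order-1", 1999)
retry = svc.capture_payment(key, "order-1", 1999)  # orchestrator retry after a timeout
```

Deriving the key from the workflow run and step, rather than generating a fresh one per attempt, is what lets the orchestrator retry blindly.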

Check Yourself

  1. In one sentence, what is the coupling that choreography hides and orchestration makes explicit?
  2. Name two situations where orchestration is clearly better, and two where choreography is clearly better.
  3. Why is "add a new observer" trivial in choreography and uninteresting in orchestration?

Mini Drill or Application

Take a workflow you know (order checkout, ride dispatch, loan approval, support ticket escalation). In 25 minutes:

  1. Sketch it as choreography (services, events, subscribers).
  2. Sketch it as orchestration (orchestrator, steps, commands).
  3. Name two things the choreography version hides from a support rep.
  4. Name two things the orchestration version makes expensive to change.
  5. Pick one for a real team and defend it in three sentences.

Transfer to Adjacent Domains

  • Sagas (Concept 11). The choreography/orchestration split is the saga variant split. This concept gives you the tradeoff table; Concept 11 gives you the failure-handling mechanics. You'll always use them together.
  • Business process / BPM heritage. Orchestration in microservices is the modern descendant of BPM (Camunda, jBPM). If your org has a BPM practice, Camunda 8 / Zeebe is the natural bridge; teams without BPM DNA usually prefer Temporal.
  • Observability (S8M5). Choreographed workflows demand excellent distributed tracing -- the only way to see the whole graph is to stitch spans by correlation ID. Orchestrated workflows have a built-in "tracing surface" in the orchestrator's history view.
  • Team topologies (S7M4). Choreography favors autonomous teams with minimal coordination; orchestration favors a designated workflow-owning team. Choosing the wrong shape for your org produces predictable friction (autonomous teams hating a central orchestrator team, or a "coordination" team that isn't actually empowered to own the workflow).
  • Serverless step-runner stacks. Step Functions, Workflows (GCP), Durable Functions (Azure) -- these are managed orchestrators that offer the same core capabilities as Temporal (durable state, retries, timers), with different programming models and different pricing/vendor lock-in curves. The concept transfers; the service choice is secondary.

Read This Only If Stuck