Synchronous REST/gRPC vs Asynchronous Events

What This Concept Is

Every service-to-service interaction is either synchronous or asynchronous. The choice is made per interaction, not per system, and the two styles compose freely.

  • Synchronous (REST or gRPC). The caller blocks until the callee responds (or times out). Request-response. Good for: read paths, low-latency commands, interactions where the caller genuinely cannot proceed without an answer.
  • Asynchronous (events/messages). The caller publishes a message and continues. One or more consumers handle it later. Good for: fan-out, write paths where downstreams do not need to be live, decoupling, tolerating partial outages.
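
The difference between the two styles can be sketched in-process. This is a minimal illustration, not a real transport: `inventory_reserve` is a hypothetical callee, and a `queue.Queue` with a worker thread stands in for a message broker and its consumer.

```python
import queue
import threading
import time


def inventory_reserve(order_id):
    """Hypothetical callee: pretends to do 50 ms of work."""
    time.sleep(0.05)
    return f"reserved:{order_id}"


# Synchronous: the caller blocks until the callee returns.
start = time.perf_counter()
sync_result = inventory_reserve("o-1")
sync_elapsed = time.perf_counter() - start

# Asynchronous: the caller enqueues a message and continues;
# a consumer thread handles the message later.
bus = queue.Queue()  # stand-in for a broker
handled = []


def consumer():
    while True:
        order_id = bus.get()
        if order_id is None:  # sentinel: stop the worker
            break
        handled.append(inventory_reserve(order_id))


worker = threading.Thread(target=consumer)
worker.start()

start = time.perf_counter()
bus.put("o-2")  # "publish": returns without waiting for the consumer
publish_elapsed = time.perf_counter() - start

bus.put(None)
worker.join()  # only so the sketch can inspect `handled` afterwards
```

The caller pays the callee's full latency in the sync case and only the cost of the enqueue in the async case; the work itself still happens, just off the caller's path.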

The cost model:

Dimension                          | Sync                               | Async
Latency perception                 | immediate                          | eventual
Coupling between caller and callee | time and availability              | schema only
Failure mode                       | cascades unless isolated           | back-pressure, dead-letter queues
Debugging a single request         | easier (stack trace across hops)   | harder (event correlation across consumers)
Adding new consumers               | requires caller change             | transparent

Why It Matters Here

Beginners (and most surviving monoliths-turned-microservices) default to sync for everything. That works until one downstream slows down: callers block waiting on it, their threads and connection pools fill up, and the stall cascades upstream until the whole system is effectively down. Advanced teams (and ones that have been paged enough) default to async for write paths and use sync only where latency requires it.

Concrete Example: Same Workflow, Two Styles

"Customer places order." Three services are involved: Orders, Payments, Inventory.

All-sync design:

Every hop is on the critical path. If Inventory is slow by 200ms, the user waits. If Payments is down, checkout is down.
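
A minimal sketch of the all-sync shape, assuming Orders calls Payments and then Inventory in sequence. The call order, the `call` helper, and the latency figures (120ms and 80ms) are illustrative assumptions, and latency is simulated by adding numbers rather than by sleeping.

```python
class ServiceDown(Exception):
    """Raised when a simulated downstream service is unavailable."""


def call(service, latency_ms, up=True):
    """Hypothetical sync call: returns the latency it cost, or raises if down."""
    if not up:
        raise ServiceDown(service)
    return latency_ms


def place_order_all_sync(payments_up=True, inventory_extra_ms=0):
    """Orders handler: every downstream call sits on the user's critical path."""
    total_ms = 0
    total_ms += call("payments", 120, up=payments_up)
    total_ms += call("inventory", 80 + inventory_extra_ms)
    return total_ms


healthy_ms = place_order_all_sync()
slow_ms = place_order_all_sync(inventory_extra_ms=200)  # user waits the extra 200 ms

try:
    place_order_all_sync(payments_up=False)
    checkout_available = True
except ServiceDown:
    checkout_available = False  # Payments down means checkout is down
```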

Hybrid (sync for payment, async for everything else):

Payment is sync because the user needs confirmation. Inventory and Fulfillment are async because they do not need to block the user. Orders becomes resilient to Inventory and Fulfillment outages -- it only depends on the event bus being up.
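
A minimal sketch of the hybrid shape, under stated assumptions: `charge_payment` is a hypothetical sync call, a `deque` stands in for a durable event bus, and a single `drain` function stands in for the Inventory and Fulfillment consumers.

```python
from collections import deque

event_bus = deque()  # stand-in for a durable broker


def charge_payment(order_id):
    """Hypothetical sync call to Payments; the user waits for this result."""
    return "charged"


def place_order_hybrid(order_id):
    receipt = charge_payment(order_id)  # sync: user needs confirmation
    # async: publish and return; downstream consumers react later
    event_bus.append({"type": "OrderConfirmed", "order_id": order_id})
    return f"confirmed:{receipt}"


def drain(consumer_up):
    """Deliver queued events if the consumer is up; otherwise they wait on the bus."""
    delivered = []
    while consumer_up and event_bus:
        delivered.append(event_bus.popleft())
    return delivered


result = place_order_hybrid("o-1")          # succeeds even if consumers are down
while_down = drain(consumer_up=False)       # nothing delivered, nothing lost
after_recovery = drain(consumer_up=True)    # backlog is processed on recovery
```

The order is confirmed before any consumer runs; an outage downstream delays the reservation but does not fail the checkout.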

Common Confusion / Misconception

"Async is always better." No. Async moves complexity to the consumer (idempotency, retries, dead-letter queues, out-of-order handling) and to the observability layer (harder to trace). Reads usually stay sync. User-visible commands where the user must see the result usually stay sync.

"Sync is simpler." In isolation, yes. At the system level, a chain of three sync calls has a failure probability approximately the sum of the individual failure probabilities, and a tail latency approximately the sum of the tail latencies. Simple behavior, bad properties.

How To Use It

For each interaction, ask:

  1. Does the caller need the answer to proceed for the user? -> sync.
  2. Are we fanning out to multiple consumers? -> async.
  3. Can the callee be down for a minute without breaking the user flow? -> async.
  4. Is the interaction purely internal bookkeeping (analytics, audit, projections)? -> async.
  5. Would a chain of more than 2 sync hops be required? -> break the chain with async.
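
The checklist can be written down as a small function. This is a sketch only: the field names are hypothetical, and the precedence (the hop-count rule is checked first, then rule 1, then rules 2 to 4) is an assumption, since the list itself does not say which rule wins when several apply.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    name: str
    caller_needs_answer: bool = False   # rule 1
    fans_out: bool = False              # rule 2
    callee_may_be_down: bool = False    # rule 3
    internal_bookkeeping: bool = False  # rule 4
    sync_hops_on_path: int = 1          # rule 5: hops if this one were sync


def choose_style(i):
    if i.sync_hops_on_path > 2:
        return "async"  # break the chain (assumed to take precedence)
    if i.caller_needs_answer:
        return "sync"
    if i.fans_out or i.callee_may_be_down or i.internal_bookkeeping:
        return "async"
    return "undecided"  # no rule matched; needs a human judgment


decisions = {
    i.name: choose_style(i)
    for i in [
        Interaction("charge card", caller_needs_answer=True),
        Interaction("reserve stock", callee_may_be_down=True),
        Interaction("emit analytics", fans_out=True, internal_bookkeeping=True),
        Interaction("third hop in a chain", caller_needs_answer=True,
                    sync_hops_on_path=3),
    ]
}
```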

Combine freely. A single user request may touch three sync services and emit five async events.

Choosing Between REST and gRPC (if Sync)

  • REST/HTTP+JSON: best for public APIs, for consumers you do not control, for browser consumption. Easy to debug with curl, widely supported.
  • gRPC: best for internal service-to-service, where both sides ship protos in lockstep, where latency matters, where streaming is useful. Binary, typed, fast, harder to debug by hand.

Many mature stacks use REST at the edge, exposed through the gateway, and gRPC internally between the services behind it (see concept 11).

Choosing an Event Backbone (if Async)

Three common families:

  • Kafka / Pulsar. Log-based. Ordered per partition, replayable, long retention. Good for event-sourced systems and high fan-out.
  • RabbitMQ / SQS. Queue-based. Competing consumers per queue, acknowledged delivery, dead-letter queues. Good for task queues.
  • Cloud pub/sub (Google Pub/Sub, AWS SNS+SQS, Azure Service Bus). Hosted, varies but typically queue-like with fan-out.
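
The log-based and queue-based families differ mainly in who tracks delivery. The toy in-memory models below illustrate that difference; `Log` and `WorkQueue` are hypothetical classes, not the API of Kafka, RabbitMQ, or any real broker, and they ignore partitions, acknowledgements, and persistence.

```python
from collections import deque


class Log:
    """Log-based: records are retained; each consumer group tracks its own offset."""

    def __init__(self):
        self.records = []
        self.offsets = {}

    def append(self, record):
        self.records.append(record)

    def poll(self, group):
        pos = self.offsets.get(group, 0)
        batch = self.records[pos:]
        self.offsets[group] = len(self.records)
        return batch

    def rewind(self, group, offset=0):
        self.offsets[group] = offset  # replay: re-read retained records


class WorkQueue:
    """Queue-based: each message is handed to exactly one competing consumer."""

    def __init__(self):
        self.pending = deque()

    def send(self, message):
        self.pending.append(message)

    def receive(self):
        return self.pending.popleft() if self.pending else None


log = Log()
log.append("a")
log.append("b")
billing_first = log.poll("billing")
analytics_first = log.poll("analytics")  # independent group sees everything too
log.rewind("billing")
billing_replay = log.poll("billing")

tasks = WorkQueue()
tasks.send("a")
tasks.send("b")
worker_1_got = tasks.receive()
worker_2_got = tasks.receive()
nothing_left = tasks.receive()
```

In the log model a new consumer group can be added later and still read history; in the queue model a message is gone once a worker has taken it.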

Pick based on replay needs, ordering needs, and operational preference. The contract (concept 08) is more important than the choice of broker.

Check Yourself

  1. Why does converting a 3-hop sync chain to async often improve both latency (p99) and availability?
  2. What complexity does async move onto the consumer?
  3. When is sync the right answer despite the cost?

Mini Drill or Application

Take a workflow you know (place order, post tweet, upload photo, etc.). List every service-to-service interaction. For each, mark sync or async with a one-line justification. Count the number of sync hops on the user-critical path. If it is more than 2, propose which hop to make async.

How This Sits In The Module

Concept 11 handles how sync calls are routed (discovery, gateway, BFF). Concept 12 handles how sync calls survive partial failure. S8 M3 goes deep on async patterns (event-driven architecture, sagas).

Choreography vs Orchestration (Preview of S8 M3)

When multiple services coordinate to complete a workflow, there are two structural patterns:

  • Choreography. Each service emits events; others react without a central coordinator. Example: Orders emits OrderConfirmed; Inventory reacts with reserve-stock, emits StockReserved; Fulfillment reacts with schedule-shipment. The workflow is implicit in the event chain. Loose coupling, easy to extend (add a new consumer, no one else changes), hard to debug (no single place says "this is the checkout flow").
  • Orchestration. A central coordinator (orchestrator or saga manager) calls each service in sequence. The workflow is explicit; debugging is easier; coupling to the orchestrator is tight.
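
The two patterns can be contrasted with the example events named above. This is a minimal in-process sketch: `subscribe`, `publish`, and `checkout_saga` are hypothetical helpers, and the synchronous in-memory dispatch stands in for a real broker.

```python
from collections import defaultdict

# --- Choreography: the workflow is implicit in who subscribes to what. ---
trace = []
subscribers = defaultdict(list)


def subscribe(event_type, handler):
    subscribers[event_type].append(handler)


def publish(event_type, order_id):
    for handler in subscribers[event_type]:
        handler(order_id)


def inventory_on_order_confirmed(order_id):
    trace.append(f"inventory: reserve-stock {order_id}")
    publish("StockReserved", order_id)


def fulfillment_on_stock_reserved(order_id):
    trace.append(f"fulfillment: schedule-shipment {order_id}")


subscribe("OrderConfirmed", inventory_on_order_confirmed)
subscribe("StockReserved", fulfillment_on_stock_reserved)
publish("OrderConfirmed", "o-1")  # Orders only emits; it knows no consumers
choreography_trace = list(trace)


# --- Orchestration: the workflow is explicit in one place. ---
def reserve_stock(order_id):
    return f"inventory: reserve-stock {order_id}"


def schedule_shipment(order_id):
    return f"fulfillment: schedule-shipment {order_id}"


def checkout_saga(order_id):
    steps = [reserve_stock, schedule_shipment]  # the whole flow, readable here
    return [step(order_id) for step in steps]


orchestration_trace = checkout_saga("o-1")
```

Both versions perform the same steps in the same order. The difference is where a reader finds that order: spread across subscriptions in the first, listed in `checkout_saga` in the second.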

Both patterns rely on async messaging. The choice is about where the workflow is represented. Most real systems blend both: choreography for loose capabilities (notifications, analytics), orchestration for user-critical workflows where you must see the whole thing (checkout saga). Chris Richardson's Saga pattern treats this in depth; it is the main subject of S8 M3.

Latency Math: Why Sync Chains Explode

A simple model: if each sync hop has an independent failure probability p and a p99 latency L, then a chain of N hops has:

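  • Failure probability: 1 − (1 − p)^N ≈ N·p for small p. With p = 0.01 (99% single-hop success) and N = 5, end-to-end success drops to about 95% -- roughly a 5x increase in failure rate over a single hop.
  • p99 latency: for hops that run in sequence, a conservative planning estimate is the sum of the per-hop p99s. If every hop has a 100ms p99, budget ~500ms for a 5-hop chain even when everything is healthy.
  • Tail exposure grows with N: with independent hops, the chance that at least one hop lands in its own slowest 1% is 1 − 0.99^N, about 4.9% for N = 5, so the slowest hop tends to dominate end-to-end latency. See Jeffrey Dean and Luiz André Barroso's "The Tail at Scale" for the foundational analysis.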
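
The arithmetic in this model is easy to check directly. A minimal sketch, assuming independent hops as the model does; the function names are illustrative.

```python
def chain_failure_probability(p, n):
    """Probability that at least one of n independent hops fails."""
    return 1 - (1 - p) ** n


def chain_p99_budget_ms(per_hop_p99_ms, n):
    """Conservative planning estimate: sum of per-hop p99s for sequential hops."""
    return per_hop_p99_ms * n


exact = chain_failure_probability(0.01, 5)   # about 0.049
approximation = 5 * 0.01                     # N * p
success = 1 - exact                          # about 0.951
budget_ms = chain_p99_budget_ms(100, 5)
```

For small p the linear approximation N·p slightly overstates the exact value, which is why the two figures agree to within a tenth of a percentage point here.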

Converting a sync hop to async removes it from the user-critical path entirely; it no longer adds to end-to-end p99 at all. This is why the "no more than 2 sync hops" heuristic above is a hard ceiling, not a preference.

Read This Only If Stuck

Depth Path

  • Ben Stopford, Designing Event-Driven Systems -- the sequel book that goes deep on the async path. You will open this properly in S8 M3.
  • Martin Kleppmann, Designing Data-Intensive Applications, chapters 4 ("Encoding and Evolution") and 11 ("Stream Processing") -- the formal backbone for contracts over both transports.