Idempotency, Exactly-Once Semantics, and Retries

What This Concept Is

A function f is idempotent if calling it twice has the same observable effect as calling it once: f(f(x)) = f(x). In distributed terms, an operation is idempotent if retrying it (because you didn't hear back) does not corrupt state.
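
A minimal sketch of the distinction, using two hypothetical operations on an in-memory account. The names set_balance and add_to_balance are illustrative assumptions, not part of any real API:

```python
# Hypothetical in-memory state, used only to illustrate the definition.
account = {"balance": 0}


def set_balance(value: int) -> None:
    """Idempotent: repeating the call leaves the same final state."""
    account["balance"] = value


def add_to_balance(delta: int) -> None:
    """Not idempotent: every repeat changes the state again."""
    account["balance"] += delta


set_balance(100)
set_balance(100)  # a retry of the same call; state is unchanged
after_set = account["balance"]

add_to_balance(100)
add_to_balance(100)  # a retry of the same call; state changes again
after_add = account["balance"]
```

Retrying set_balance is harmless; retrying add_to_balance applies the delta a second time, which is the corruption the definition is warning about.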

Three delivery semantics you will see:

  • At-most-once: the operation is tried and if anything goes wrong, it is not retried. Possible outcome: never executed. Easy to implement, wrong for most business operations.
  • At-least-once: on uncertainty, retry until you get a success. Possible outcome: executed more than once. Easy to implement, wrong when the operation is not idempotent.
  • Effectively-once (sometimes called exactly-once): the operation is observable as having happened exactly once from the client's perspective, achieved by at-least-once delivery plus idempotent or deduplicated processing.

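True exactly-once delivery is impossible in an asynchronous network where messages or nodes can fail: when no acknowledgment arrives, the sender cannot tell a lost request from a lost reply, which is the situation described by the Two Generals' Problem. It must either give up (at-most-once) or retry (at-least-once). Exactly-once processing is still achievable because the receiver can deduplicate using an idempotency key.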
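
The combination of at-least-once delivery and receiver-side deduplication can be sketched as a small simulation. Everything here is an assumption made for illustration: the Receiver class, the message ID "m-1", and a channel that drops a fixed number of acknowledgments.

```python
class Receiver:
    """Applies each message ID at most once, however often it is delivered."""

    def __init__(self) -> None:
        self.seen_ids = set()
        self.applied = []      # effects that were actually applied
        self.deliveries = 0    # how many times a message arrived

    def handle(self, msg_id: str, payload: str) -> str:
        self.deliveries += 1
        if msg_id not in self.seen_ids:
            self.seen_ids.add(msg_id)
            self.applied.append(payload)
        return "ack"


def send_at_least_once(receiver: Receiver, msg_id: str, payload: str,
                       lost_acks: int) -> int:
    """Resend until an ack gets through; the first `lost_acks` acks are dropped."""
    attempts = 0
    while True:
        attempts += 1
        receiver.handle(msg_id, payload)   # the request itself always arrives
        if attempts > lost_acks:           # this ack survives the network
            return attempts
        # Ack lost: the sender cannot know the request was processed, so it retries.


receiver = Receiver()
attempts = send_at_least_once(receiver, "m-1", "debit 100", lost_acks=2)
```

The message is delivered three times but applied once, which is what "effectively-once" means in practice.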

Why It Matters Here

Every cross-service call you will ever write has this question behind it. When you see a "retry on failure" line in code, you are making a semantics claim:

  • If the operation is idempotent (e.g., PUT /users/42 { ... } with full state): at-least-once is fine.
  • If the operation is naturally non-idempotent (e.g., POST /orders, POST /transfers): you must add an idempotency key or server-side dedup.
  • If the operation has external side effects (charge card, send email, decrement stock): idempotency is required unless you can tolerate duplicates.
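
One way to make that semantics claim explicit is to gate retries on it. The sketch below is an assumption about how a client might encode the rule, not a description of any particular HTTP library; safe_to_retry is a hypothetical helper. It also assumes the server's handlers for the listed methods have no non-idempotent side effects.

```python
# Methods that HTTP semantics define as idempotent.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def safe_to_retry(method: str, headers: dict) -> bool:
    """Return True if blindly retrying this request is a defensible choice."""
    if method.upper() in IDEMPOTENT_METHODS:
        return True
    # A non-idempotent method is only retryable when the server can deduplicate it.
    return any(name.lower() == "idempotency-key" for name in headers)


put_ok = safe_to_retry("PUT", {})
post_bare = safe_to_retry("POST", {})
post_keyed = safe_to_retry("POST", {"Idempotency-Key": "8a6b-c1d0"})
```

A retry wrapper that consults a check like this refuses to retry a bare POST instead of silently risking a duplicate.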

This concept connects the entire "partial failure" story to API design. It is where the abstract concerns of the earlier clusters become product features.

Concrete Example: Idempotent Payment

Client wants to charge $100 to card X. A non-idempotent API:

POST /charges
body: { card: X, amount: 100 }

If the client retries on timeout, the server may process the request twice and charge the card twice.

An idempotent API using an idempotency key:

POST /charges
headers: Idempotency-Key: 8a6b-c1d0-...
body: { card: X, amount: 100 }

Server behavior:

  1. On receipt: look up the idempotency key in a fast store (Redis, DB).
  2. If not seen: atomically record the key as "in progress" (so a concurrent retry cannot also start processing), process the charge, store (key -> response) with a TTL (e.g., 24h), return the response.
  3. If seen and still processing: wait for the first request to finish (or return "in progress").
  4. If seen and completed: return the stored response without reprocessing.

The client can retry freely. The server processes the charge at most once per key for as long as the key is retained, and returns the same stored response for every replay of a completed request.
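
The four server steps can be sketched with an in-memory store. This is a minimal illustration under stated assumptions: IdempotentChargeServer, post_charge, and _charge_card are hypothetical names, the dictionary stands in for Redis or a database, and a real service would also have to handle a crash between charging and recording the response.

```python
import threading
import time
import uuid


class IdempotentChargeServer:
    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.monotonic):
        self._lock = threading.Lock()
        self._records = {}     # key -> {"status", "response", "expires_at"}
        self._ttl = ttl_seconds
        self._clock = clock
        self.charges_made = 0  # how many times the card was really charged

    def _charge_card(self, card: str, amount: int) -> dict:
        # Stand-in for the call to a payment processor.
        return {"status": "succeeded", "card": card, "amount": amount,
                "charge_id": uuid.uuid4().hex}

    def post_charge(self, idempotency_key: str, card: str, amount: int) -> dict:
        now = self._clock()
        with self._lock:
            # Step 1: look up the key, treating expired records as absent.
            record = self._records.get(idempotency_key)
            if record is not None and record["expires_at"] <= now:
                record = None
            if record is None:
                # Step 2 (first half): claim the key before doing any work.
                self._records[idempotency_key] = {
                    "status": "in_progress", "response": None,
                    "expires_at": now + self._ttl}
            elif record["status"] == "in_progress":
                # Step 3: another request with this key is still running.
                return {"status": "in_progress"}
            else:
                # Step 4: replay the stored response without reprocessing.
                return record["response"]
        # Step 2 (second half): process outside the lock, then store the result.
        # If _charge_card raises, the key stays "in_progress" in this sketch.
        response = self._charge_card(card, amount)
        with self._lock:
            self._records[idempotency_key].update(status="completed",
                                                  response=response)
            self.charges_made += 1
        return response


server = IdempotentChargeServer()
first = server.post_charge("8a6b-c1d0", "X", 100)
retry = server.post_charge("8a6b-c1d0", "X", 100)  # client retried after a timeout
```

Here the retry returns the stored response and the card is charged once. Claiming the key under the lock before processing is what prevents two concurrent retries from both reaching the payment processor.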

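Stripe's API accepts an Idempotency-Key header, and AWS and GCP expose comparable client-supplied tokens on a number of their write APIs. The header is described in an IETF Internet-Draft, "The Idempotency-Key HTTP Header Field"; an Internet-Draft is a work in progress, not a published RFC.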

The End-to-End Argument

The idempotency guarantee is most reliably enforced at the application layer, not at the network or transport layer. TCP delivers bytes in order and without duplication, but only within a single connection; once that connection breaks, it cannot tell you whether the peer acted on the request. HTTP does not deduplicate requests, so a client or proxy that retries simply issues a new request. Only your application knows what the semantic operation is and when two requests are "the same." This is the end-to-end argument (Saltzer, Reed, Clark 1984) applied to reliable delivery.

Common Confusion / Misconception

"Kafka has exactly-once semantics." Kafka's exactly-once mode (idempotent producer + transactional writes) achieves exactly-once processing for reads-then-writes within the Kafka ecosystem. The moment you bridge to an external system (send an email, debit a bank), you are back to at-least-once, and you need application-layer idempotency.

A second misconception: "HTTP PUT is idempotent, so I don't need idempotency keys for PUTs." PUT is idempotent with respect to the resource's state: PUTting the same value twice gives the same final state. But if your PUT handler also has a side effect (appending to a log, incrementing a counter, allocating a resource), that side effect is not naturally idempotent, and you still need a key.
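
A small sketch of that second misconception, assuming a hypothetical put_user handler that writes an audit entry on every call. The handler, the audit log, and the key "op-1" are all illustrative:

```python
users = {}
audit_log = []
seen_keys = set()


def put_user(user_id: int, body: dict, idempotency_key=None) -> None:
    # The state change itself is idempotent: same body, same final state.
    users[user_id] = body
    # The side effect is not, unless it is deduplicated by key.
    if idempotency_key is not None:
        if idempotency_key in seen_keys:
            return
        seen_keys.add(idempotency_key)
    audit_log.append(("updated", user_id))


put_user(42, {"name": "Ada"})
put_user(42, {"name": "Ada"})  # retry without a key: second audit entry
entries_without_key = len(audit_log)

put_user(42, {"name": "Ada"}, idempotency_key="op-1")
put_user(42, {"name": "Ada"}, idempotency_key="op-1")  # retry with the same key
entries_with_key = len(audit_log) - entries_without_key
```

The user record is identical after every call, but only the keyed retry avoids the duplicate audit entry.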

A third: "If I add idempotency keys to every API, I'm done." You also need to think about: how clients generate keys (one UUID per intended operation), key TTL (long enough to cover all retries), key store durability (the key record should be committed together with the operation's effect, so a crash cannot leave one without the other), and what to do when a key is reused with a different request body (usually: reject as a client bug).
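
Detecting a reused key is commonly done by storing a fingerprint of the request next to the response. The following is a sketch under assumptions: KeyReuseError, fingerprint, and handle are hypothetical names, and hashing canonical JSON is just one reasonable choice of fingerprint.

```python
import hashlib
import json


class KeyReuseError(Exception):
    """The same idempotency key arrived with a different request body."""


def fingerprint(body: dict) -> str:
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


stored = {}  # idempotency key -> (request fingerprint, response)


def handle(key: str, body: dict, process) -> dict:
    fp = fingerprint(body)
    if key in stored:
        saved_fp, saved_response = stored[key]
        if saved_fp != fp:
            raise KeyReuseError(f"key {key!r} reused with a different body")
        return saved_response
    response = process(body)
    stored[key] = (fp, response)
    return response


calls = []


def fake_process(body: dict) -> dict:
    # Stand-in for the real operation; records each time it actually runs.
    calls.append(body)
    return {"status": "succeeded", "amount": body["amount"]}


first = handle("key-1", {"card": "X", "amount": 100}, fake_process)
replay = handle("key-1", {"amount": 100, "card": "X"}, fake_process)
```

A later call such as handle("key-1", {"card": "X", "amount": 999}, fake_process) raises KeyReuseError instead of silently returning the response for a different charge.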

How To Use It

When you design any write API:

  1. Decide whether the operation is naturally idempotent. If no (POST /orders, charge, notify), require an Idempotency-Key header.
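  2. Spec the key: client-generated UUID, scoped to one resource (the same key value sent by a different caller or to a different endpoint is treated as a different key), TTL at least 24h.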
  3. Spec the server: cache the response with the key; return cached response on duplicate.
  4. Clients: generate one key per intended operation and reuse it for all retries of that operation. Don't reuse across distinct operations.
  5. Document at-least-once for everything upstream of your idempotency check (queues, load balancers, proxies).
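
The client side of step 4 can be sketched as follows. The transport is a stand-in: call_with_retries, flaky_send, and the Timeout exception are hypothetical, and a real client would add backoff between attempts.

```python
import uuid


class Timeout(Exception):
    """Raised by the transport when no response arrives in time."""


def call_with_retries(send, body: dict, max_attempts: int = 5) -> dict:
    # One key per intended operation, generated once, before the first attempt.
    key = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            # Every retry of this operation carries the same key.
            return send({"Idempotency-Key": key}, body)
        except Timeout:
            if attempt == max_attempts:
                raise


keys_seen = []


def flaky_send(headers: dict, body: dict) -> dict:
    # Stand-in transport: times out twice, then succeeds.
    keys_seen.append(headers["Idempotency-Key"])
    if len(keys_seen) < 3:
        raise Timeout()
    return {"status": "succeeded"}


result = call_with_retries(flaky_send, {"card": "X", "amount": 100})
```

All three attempts carry the same key. A second call to call_with_retries generates a fresh key, because it represents a distinct intended operation.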

Check Yourself

  1. Why is exactly-once delivery impossible in general?
  2. How does an idempotency key turn at-least-once delivery into effectively-once processing?
  3. Why is the end-to-end argument relevant here?
  4. Give one operation that is naturally idempotent and one that is not.

Mini Drill or Application

Design an idempotent "send-email" API. Include: request shape, idempotency key scope and lifetime, what happens on replay within the TTL, what happens on replay after TTL expires.

Read This Only If Stuck