
Idempotency, Deduplication, and the Exactly-Once Illusion

What This Concept Is

Idempotency: applying an operation twice has the same observable effect as applying it once. set_user_email(u, 'a@b.com') is idempotent. charge_card($10) is not -- two applications charge $20.

Deduplication: detecting that a given message is a duplicate of one already processed, and skipping it.

"Exactly-once": a loaded phrase. Strictly, exactly-once delivery is impossible in an asynchronous distributed system (there is always a failure window between "we processed it" and "we told the broker we processed it"). What is achievable is effectively exactly-once processing, through at-least-once delivery + idempotent consumers + deduplication.

The takeaway that matters most:

Regardless of what your broker claims, treat every message as potentially redelivered. Idempotency is the consumer's responsibility, and it is not optional.

Why It Matters Here

Every concept in this module generates duplicates:

  • outbox relay (Concept 06) retries on crash -> duplicate events published
  • Kafka at-least-once (Concept 09) delivers again on consumer crash -> duplicate events consumed
  • saga retries (Concept 11) re-send commands on timeout -> duplicate operations attempted
  • network retries anywhere -> duplicate HTTP calls

If your consumer is not idempotent, duplicates silently corrupt data: double billing, double shipping, triple welcome emails. The fix is at the consumer, not the broker.

Techniques

1. Naturally idempotent operations

Some operations are idempotent by design:

  • SET operations (updates to a specific value): set_status(order, 'shipped')
  • DB INSERT ... ON CONFLICT DO NOTHING keyed by a unique event ID
  • PUT on a keyed resource
  • state-machine transitions that only advance in one direction

Prefer this style when you can.
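A minimal sketch of the difference, using a plain dict as a stand-in for an order record. The names set_status and add_loyalty_points are hypothetical and chosen only for illustration:

```python
def set_status(order: dict, status: str) -> dict:
    """SET-style update: the result depends only on the argument."""
    return {**order, "status": status}


def add_loyalty_points(order: dict, points: int) -> dict:
    """Increment-style update: the result depends on how often it runs."""
    return {**order, "points": order["points"] + points}


order = {"id": "o-1", "status": "pending", "points": 0}

# Applying the SET twice gives the same record as applying it once.
set_once = set_status(order, "shipped")
set_twice = set_status(set_once, "shipped")
print(set_once == set_twice)  # True

# Applying the increment twice does not.
inc_once = add_loyalty_points(order, 10)
inc_twice = add_loyalty_points(inc_once, 10)
print(inc_once == inc_twice)  # False
```

The increment is the kind of operation that needs one of the techniques below before it is safe to retry.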

2. Deduplication by event ID

Store every handled event_id in a dedup table (or cache). On receipt:

def handle(event):
    if dedup.seen(event.event_id):
        return  # already processed
    with db.transaction():
        apply_effects(event)
        dedup.mark_seen(event.event_id)

Two things must be true for this to work:

  • apply_effects and dedup.mark_seen live in one transaction (otherwise you reintroduce a dual-write bug inside the consumer)
  • the dedup store is durable and has a retention window at least as long as the broker's max redelivery window

Storage options: a DB table processed_events(event_id PK, processed_at), a Redis set with TTL, or an embedded store like RocksDB in a stream processor. Scale the retention to the possible delay; a TTL of "max outbox lag + max consumer outage + margin" is a good rule of thumb.
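The retention rule can be sketched with an in-memory store. This is only an illustration of the TTL behavior: a real dedup store must be durable, the class name TtlDedupStore is hypothetical, and the three durations are made-up example values, not recommendations.

```python
import time
from typing import Callable, Dict


class TtlDedupStore:
    """In-memory stand-in for a dedup store with a retention window."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expiry: Dict[str, float] = {}  # event_id -> expiry time

    def seen(self, event_id: str) -> bool:
        expiry = self._expiry.get(event_id)
        if expiry is None:
            return False
        if expiry <= self._clock():
            del self._expiry[event_id]  # entry aged out of the window
            return False
        return True

    def mark_seen(self, event_id: str) -> None:
        self._expiry[event_id] = self._clock() + self._ttl


# Rule of thumb from the text, with assumed example durations in seconds.
MAX_OUTBOX_LAG_S = 5 * 60
MAX_CONSUMER_OUTAGE_S = 4 * 60 * 60
MARGIN_S = 60 * 60
DEDUP_TTL_S = MAX_OUTBOX_LAG_S + MAX_CONSUMER_OUTAGE_S + MARGIN_S
```

Once an entry expires, a late redelivery of the same event_id is treated as new, which is why the TTL has to cover the worst-case delay rather than the typical one.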

3. Idempotency keys on external APIs

When you are producing a side effect (charging a card, calling a third-party API), pass an idempotency key. Stripe, Shopify, Plaid, and most serious APIs accept one:

POST /charges
Idempotency-Key: saga_9f2a_step_charge_v1
{ "amount": 4299, "currency": "usd" }

For as long as the provider retains the key, a repeated request with the same key returns the original outcome instead of repeating the side effect. Generate the key deterministically from saga_id + step_name so that a retry produces the same key.
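A minimal sketch of deterministic key generation. The function name idempotency_key and the version suffix are assumptions made for illustration; the output format simply mirrors the example key in the request above.

```python
def idempotency_key(saga_id: str, step_name: str, version: int = 1) -> str:
    """Derive a stable key: the same saga step always yields the same key."""
    return f"saga_{saga_id}_step_{step_name}_v{version}"


def charge_headers(saga_id: str) -> dict:
    """Headers for the charge step; a retry sends an identical key."""
    return {"Idempotency-Key": idempotency_key(saga_id, "charge")}


first_attempt = charge_headers("9f2a")
retry = charge_headers("9f2a")
print(first_attempt == retry)  # True
```

The key contains no timestamp, attempt counter, or random component. A key generated per attempt (for example with uuid4) would look unique to the provider on every retry and defeat the purpose.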

4. Conditional updates (optimistic concurrency)

For mutable state, use WHERE version = ? or conditional writes (DynamoDB ConditionExpression, ETags). Duplicate writes become no-ops because the version has already advanced.
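A small sketch of a versioned update, using Python's built-in sqlite3 module as a stand-in database. The orders table and the update_status helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT, version INTEGER)"
)
conn.execute("INSERT INTO orders VALUES ('o-1', 'pending', 1)")
conn.commit()


def update_status(conn: sqlite3.Connection, order_id: str,
                  new_status: str, expected_version: int) -> bool:
    """Apply the write only if the row is still at expected_version."""
    cur = conn.execute(
        "UPDATE orders SET status = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_status, order_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows touched means the version moved on


first = update_status(conn, "o-1", "shipped", expected_version=1)
duplicate = update_status(conn, "o-1", "shipped", expected_version=1)
print(first, duplicate)  # True False
```

The duplicate matches no rows because the first write already advanced the version to 2, so it changes nothing.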

5. Ordering + monotonic state

For state that can only move forward (order pending -> shipped -> delivered), the consumer ignores any event that would move it backward. Combined with per-partition ordering, this is often enough.
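One way to sketch the forward-only rule is to rank the states and accept an event only if it raises the rank. STATUS_RANK and apply_status_event are hypothetical names.

```python
STATUS_RANK = {"pending": 0, "shipped": 1, "delivered": 2}


def apply_status_event(current: str, incoming: str) -> str:
    """Advance only; stale or duplicate events leave the state unchanged."""
    if STATUS_RANK[incoming] > STATUS_RANK[current]:
        return incoming
    return current


state = "pending"
# Includes a duplicate "shipped" and two stale events.
for incoming in ["shipped", "shipped", "pending", "delivered", "shipped"]:
    state = apply_status_event(state, incoming)
print(state)  # delivered
```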

A Worked Idempotency Pattern

Consumer handling PaymentCaptured to update a customer ledger:

def on_payment_captured(event):
    # Assumes db.execute returns a DB-API style cursor.
    with db.transaction():
        # 1. dedup by event_id: RETURNING yields a row only on first insert
        inserted = db.execute("""
            INSERT INTO processed_events (event_id, processed_at)
            VALUES (%s, now())
            ON CONFLICT (event_id) DO NOTHING
            RETURNING event_id
        """, (event.event_id,)).fetchone()
        if inserted is None:
            return  # duplicate, skip

        # 2. apply real effect (safe to retry because the dedup insert guards it)
        db.execute("""
            INSERT INTO ledger (order_id, amount_cents, entry_type)
            VALUES (%s, %s, 'credit')
        """, (event.order_id, event.amount_cents))

Both the dedup insert and the ledger insert commit together. A second delivery finds the event_id already present and returns.
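For experimenting locally, here is a self-contained sketch of the same pattern on Python's built-in sqlite3 module. It is an adaptation, not the handler above: SQLite uses ? placeholders and datetime('now'), the duplicate is detected through rowcount rather than RETURNING, and the PaymentCaptured dataclass and make_db helper are assumptions.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentCaptured:
    event_id: str
    order_id: str
    amount_cents: int


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed_events "
        "(event_id TEXT PRIMARY KEY, processed_at TEXT)"
    )
    conn.execute(
        "CREATE TABLE ledger "
        "(order_id TEXT, amount_cents INTEGER, entry_type TEXT)"
    )
    conn.commit()
    return conn


def on_payment_captured(conn: sqlite3.Connection,
                        event: PaymentCaptured) -> bool:
    """Return True if the event was applied, False if it was a duplicate."""
    with conn:  # one transaction: commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO processed_events (event_id, processed_at) "
            "VALUES (?, datetime('now')) "
            "ON CONFLICT (event_id) DO NOTHING",
            (event.event_id,),
        )
        if cur.rowcount == 0:
            return False  # event_id already recorded: skip the effect
        conn.execute(
            "INSERT INTO ledger (order_id, amount_cents, entry_type) "
            "VALUES (?, ?, 'credit')",
            (event.order_id, event.amount_cents),
        )
        return True


conn = make_db()
event = PaymentCaptured("evt-1", "o-1", 4299)
print(on_payment_captured(conn, event))  # True
print(on_payment_captured(conn, event))  # False
```

If the ledger insert raised, the dedup row would roll back with it, so a later redelivery would still be processed.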

Why "Exactly-Once" Is an Illusion

Producers, brokers, and consumers form an asynchronous chain. At each hop:

  • producer writes to broker -> the broker may persist the message while the ack is lost; the producer retries and the message is written twice
  • broker delivers to consumer -> the consumer might crash after processing but before committing its offset, so the message is delivered again
  • consumer commits offset -> the commit is recorded in the broker, not in the consumer's downstream system, so the two can disagree after a crash

Even when Kafka supports "exactly-once semantics" (EOS) within Kafka (Kafka->Kafka with transactions), the moment you touch an external system -- a database, an HTTP API, an email -- you are back to at-least-once, because there is no atomic commit across Kafka and that external system. The solution is not stronger broker guarantees; it is idempotent consumers.

This is why the honest story is:

At-least-once delivery + idempotent processing = effectively exactly-once.

Anyone selling you anything stronger is usually selling you "exactly-once within a narrow boundary."
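The failure window can be simulated in a few lines. This is a toy model, not a broker client: the log is a list, consume and run are hypothetical helpers, and the crash is injected by returning before the offset is committed.

```python
from typing import Callable, List


def consume(log: List[dict], start: int, handle: Callable[[dict], None],
            crash_at: int = -1) -> int:
    """Handle messages from `start`; return the last committed offset.

    At offset `crash_at` the message is handled but the commit is lost,
    which models a crash between processing and the offset commit.
    """
    committed = start
    for offset in range(start, len(log)):
        handle(log[offset])
        if offset == crash_at:
            return committed  # effect applied, offset never committed
        committed = offset + 1
    return committed


def run(handle: Callable[[dict], None], log: List[dict]) -> None:
    """Crash on the second message, then restart from the committed offset."""
    committed = consume(log, 0, handle, crash_at=1)
    consume(log, committed, handle)


log = [{"event_id": "e1", "amount": 10}, {"event_id": "e2", "amount": 5}]

# Non-idempotent handler: every delivery is applied.
naive_total: List[int] = []
run(lambda event: naive_total.append(event["amount"]), log)

# Idempotent handler: a redelivered event_id is skipped.
seen = set()
dedup_total: List[int] = []


def idempotent(event: dict) -> None:
    if event["event_id"] not in seen:
        seen.add(event["event_id"])
        dedup_total.append(event["amount"])


run(idempotent, log)
print(sum(naive_total), sum(dedup_total))  # 20 15
```

Both handlers receive e2 twice. Only the idempotent one ends with the total that a single delivery would have produced.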

Common Confusion / Misconception

"Kafka's EOS means I do not need idempotency." Only if every downstream effect is also inside the same Kafka transaction (typical Kafka Streams topology). The moment you write to a DB, call an API, or send an email, you need idempotency.

"We retry forever, so eventually it works." Retries without idempotency apply the effect multiple times; they do not converge on a single successful result. Retry of a non-idempotent operation is a bug amplifier.

"We only see duplicates under failure, which is rare." You see duplicates whenever offsets commit late, consumers rebalance, the outbox relay restarts, or a producer retries after a transient broker blip. These happen daily at scale.

"Dedup cache is fine, TTL of 5 minutes." Too short. Set it to the longest redelivery horizon: broker retention for consumer-side dedup, plus worst-case outage. One hour is a reasonable floor for most systems.

"Ignore duplicates in code review; downstream will sort it out." Downstream is you, tomorrow, debugging double charges.

How To Use It

Per-consumer checklist:

  1. Is the effect naturally idempotent? If yes, great; document it.
  2. If not, pick a dedup key. Usually event_id. Ensure producers always include it.
  3. Where is the dedup store? DB table, Redis set, stream-processor state store.
  4. Is dedup in the same transaction as the effect? Must be.
  5. What is the dedup TTL? At least as long as worst-case redelivery.
  6. What happens to duplicates across operation types? (A ReleaseStock retried after success should be harmless; add a status check.)
  7. For external APIs, is an idempotency key available? Use it; generate deterministically from saga_id + step_name.

Check Yourself

  1. Why does "at-least-once + idempotency" give the same observable behavior as "exactly-once"?
  2. What must be true about the dedup store and the effect's write for idempotency to hold?
  3. Name three naturally idempotent operations and one that fundamentally is not.
  4. What is the right TTL for a dedup cache, and why?

Mini Drill or Application

Take a non-idempotent consumer you can name (sends a welcome email, charges a card, creates a record). In 20 minutes:

  1. Pick a dedup key.
  2. Choose a dedup store with justified TTL.
  3. Write pseudocode for the handler with dedup and effect in one transaction.
  4. Identify any downstream call that also needs its own idempotency key (e.g., the email provider's API).

Transfer to Adjacent Domains

  • Outbox (Concept 06). Outbox relay retries cause the duplicates that idempotency absorbs; the two patterns are engineered together, not independently.
  • Payments / billing integrations. Stripe, Adyen, and Shopify all support idempotency keys because they've learned the same lesson: duplicates are the norm, and the fix is on the caller side. Any financial integration should derive keys deterministically from saga step identity.
  • HTTP API design (S8M4). HTTP defines PUT as idempotent; POST carries no such guarantee. API designers build idempotency into their resource verbs -- the event-bus equivalent is idempotency in consumer handlers. Same principle, different layer.
  • Operational postmortems. "Double billing" and "duplicate email" postmortems usually trace to one root cause: a consumer that assumed exactly-once delivery. Make idempotency a mandatory PR-review item for new consumers.
  • Testing. Idempotency is fuzz-testable: deliver every event twice in integration tests; the resulting state must match the single-delivery state. If your tests don't exercise this, they don't exercise the production path.
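The double-delivery test in the last bullet can be sketched as a small harness. All names here (final_state, is_idempotent, and the two example consumers) are hypothetical; each consumer factory returns a handler plus a function that snapshots its state.

```python
from typing import Any, Callable, List, Tuple

Handler = Callable[[dict], None]
Factory = Callable[[], Tuple[Handler, Callable[[], Any]]]


def final_state(factory: Factory, events: List[dict], copies: int) -> Any:
    """Run a fresh consumer, delivering each event `copies` times in a row."""
    handle, snapshot = factory()
    for event in events:
        for _ in range(copies):
            handle(event)
    return snapshot()


def is_idempotent(factory: Factory, events: List[dict]) -> bool:
    """Double delivery must produce the same state as single delivery."""
    return final_state(factory, events, 1) == final_state(factory, events, 2)


def dedup_consumer() -> Tuple[Handler, Callable[[], Any]]:
    seen, balances = set(), {}

    def handle(event: dict) -> None:
        if event["event_id"] in seen:
            return
        seen.add(event["event_id"])
        key = event["order_id"]
        balances[key] = balances.get(key, 0) + event["amount_cents"]

    return handle, lambda: dict(balances)


def naive_consumer() -> Tuple[Handler, Callable[[], Any]]:
    balances = {}

    def handle(event: dict) -> None:
        key = event["order_id"]
        balances[key] = balances.get(key, 0) + event["amount_cents"]

    return handle, lambda: dict(balances)


events = [
    {"event_id": "e1", "order_id": "o-1", "amount_cents": 4299},
    {"event_id": "e2", "order_id": "o-2", "amount_cents": 1000},
]
print(is_idempotent(dedup_consumer, events))  # True
print(is_idempotent(naive_consumer, events))  # False
```

This only covers immediate redelivery. Delayed or reordered duplicates would need a harness that also shuffles the duplicated events.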

Read This Only If Stuck