Distributed Architecture Fallacies Applied to Style Choice

What This Concept Is

The Fallacies of Distributed Computing -- attributed to L. Peter Deutsch and colleagues at Sun Microsystems in the 1990s, with James Gosling credited for the eighth -- are eight false assumptions that people instinctively make about networks. Richards and Ford present them in Chapter 9 as the backbone warning for all distributed styles.

The eight fallacies:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn't change.
  6. There is only one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

Each is a lie people tell themselves while diagramming. Each predicts specific, repeatable production failures in distributed systems.

Style choice implication

When you pick a distributed style (service-based, event-driven, space-based, microservices) you sign up for all eight fallacies becoming operational concerns. When you pick a non-distributed style (layered, pipeline, modular monolith), you do not. This is the main structural cost difference between those two groups.

Why It Matters Here

This concept is where "micro-optimizing style ratings" collides with "what will actually happen in production."

Three tempting style choices survive the characteristics pass in Concept 13 and then fall, one after another, once you apply the fallacies:

  • "Microservices because scale." Fallacies 1, 2, 3, and 7 all apply; unless you have budgeted for retries, timeouts, circuit breakers, and data movement, the tax exceeds the benefit.
  • "Event-driven for coordination." Fallacies 1 and 5 apply strongly. Lose one event or reorder two, and the system state drifts. Without outbox, saga, and idempotency patterns, you are writing business bugs into your stored data.
  • "Sync REST mesh across 12 services." Fallacies 2 and 3 compound: in a sequential chain every hop adds its own latency, and the chance that at least one of the 12 calls lands in its own p99 tail is about 11% (1 - 0.99^12, assuming independent calls). End-to-end tail latency therefore stretches far beyond the sum of the mean latencies.

If you are not ready for the fallacies, you are not ready for the distributed style.

Concrete Example

Let us walk each fallacy against a microservices system and name the predicted failure:

1. The network is reliable

  • Prediction: packets drop, TCP retries, connections reset.
  • Production failure: a service that makes a call without a timeout hangs indefinitely and backs up its caller. A cascade follows.
  • Mitigation: mandatory timeouts everywhere; circuit breakers; retries with backoff and jitter.
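
The mitigations named above can be sketched in a few lines. This is a minimal illustration, not a production library: the names `CircuitBreaker`, `backoff_delays`, and `call_with_retries` are hypothetical, the thresholds are arbitrary, and the wrapped function is assumed to enforce its own timeout (for example by passing `timeout=` to its HTTP client).

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without attempting it."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; allows a trial call after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Half-open: let one trial call through.
            self.opened_at = None
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def backoff_delays(attempts, base=0.1, cap=2.0, rng=random.random):
    """Full-jitter exponential backoff: delay i is uniform in [0, min(cap, base * 2**i))."""
    return [rng() * min(cap, base * 2 ** i) for i in range(attempts)]


def call_with_retries(fn, attempts=3, sleep=time.sleep, **backoff):
    """Call fn(); on failure wait a jittered delay and retry, up to `attempts` tries in total."""
    delays = backoff_delays(attempts - 1, **backoff)
    for i in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # retrying against an open breaker only adds load
        except Exception:
            if i == attempts - 1:
                raise
            sleep(delays[i])
```

A caller would typically combine the two, for example `call_with_retries(lambda: breaker.call(fetch_order))`, where `fetch_order` is a hypothetical function that performs the remote call with an explicit timeout. The jitter matters because clients that all retry on the same schedule tend to hit a recovering service at the same instant.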

2. Latency is zero

  • Prediction: any call takes non-zero, variable time.
  • Production failure: a request that touches 6 services shows p50 = 30ms but p99 = 2s, because every hop adds another chance of hitting a slow call. Pagers fire at 2 AM.
  • Mitigation: budget latency across the call chain; use bulk endpoints; use async where the caller does not need an immediate answer; cache.
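
The compounding effect can be checked with a line of arithmetic. The sketch below assumes the calls are independent and made in sequence, which real systems only approximate; `chance_of_tail_hit` is a hypothetical helper name.

```python
def chance_of_tail_hit(n_calls, percentile=0.99):
    """Probability that at least one of n independent calls exceeds its own percentile latency."""
    return 1.0 - percentile ** n_calls


for n in (1, 6, 12):
    share = chance_of_tail_hit(n)
    print(f"{n:>2} calls: {share:.1%} of requests include at least one p99-slow call")
```

Under these assumptions, roughly 6% of requests through a 6-call chain and roughly 11% through a 12-call chain include at least one call from the slowest 1%. That is why a latency budget has to be set for the whole chain and not per service.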

3. Bandwidth is infinite

  • Prediction: moving large payloads is not free.
  • Production failure: a GET /orders that returns 10k rows with 200 fields each fills the pipe and slows every other service sharing the network.
  • Mitigation: pagination; field selection; compression; data locality (put the callers near the callee).
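
Pagination and field selection can be illustrated with a small in-memory sketch. The function `page_of`, the field names, and the data are all made up for illustration; a real endpoint would push the limit and the field list down into the database query.

```python
import json


def page_of(rows, fields, limit=100, offset=0):
    """Return one page of rows, keeping only the requested fields."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    window = rows[offset:offset + limit]
    items = [{name: row[name] for name in fields if name in row} for row in window]
    end = offset + limit
    # next_offset is None on the last page so the caller knows to stop.
    return {"items": items, "next_offset": end if end < len(rows) else None}


# Hypothetical data: 1,000 orders with 200 fields each.
orders = [{f"field_{i}": n for i in range(200)} for n in range(1000)]

full = len(json.dumps(orders))
page = len(json.dumps(page_of(orders, ["field_0", "field_1"], limit=50)))
print(f"full dump: {full:,} bytes; one trimmed page: {page:,} bytes")
```

The caller pays for the fields and rows it actually needs, and the shared network is not saturated by one oversized response.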

4. The network is secure

  • Prediction: your intra-service traffic is interceptable unless you protect it.
  • Production failure: secrets leaked in plain-text logs; auth token theft across service hops; a misconfigured load balancer lets external traffic hit internal endpoints.
  • Mitigation: mTLS or equivalent; service identity; token propagation with scope; gateway discipline.

5. Topology doesn't change

  • Prediction: services come up, go down, move, scale, redeploy.
  • Production failure: hard-coded IPs break after rolling deploy; stale DNS caches point at dead pods; stateful sessions break during horizontal scale.
  • Mitigation: service discovery; no hard-coded addresses; connection draining on shutdown; idempotent retries.
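
Idempotent retries are the least obvious item in that list, so here is a minimal sketch. It assumes the caller sends a unique key with each logical request; `IdempotentHandler` and `charge_card` are hypothetical names, and the in-memory dict stands in for what would need to be a shared, durable store in a real deployment.

```python
class IdempotentHandler:
    """Runs the wrapped handler at most once per idempotency key."""

    def __init__(self, handler):
        self._handler = handler
        self._results = {}  # assumption: a real system would use a shared durable store

    def handle(self, key, payload):
        if key in self._results:
            # A retry of a request we already processed: replay the stored result.
            return self._results[key]
        result = self._handler(payload)
        self._results[key] = result
        return result


charges = []


def charge_card(payload):
    charges.append(payload["amount_cents"])
    return {"charged": payload["amount_cents"]}


handler = IdempotentHandler(charge_card)
first = handler.handle("req-123", {"amount_cents": 500})
retry = handler.handle("req-123", {"amount_cents": 500})
print(first == retry, len(charges))  # True 1
```

With this in place, a client that retries after a pod disappears mid-request cannot charge the customer twice.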

6. There is only one administrator

  • Prediction: different teams manage different services with different policies.
  • Production failure: service A expects a 200 from service B; team B silently changes it to 204 during a rewrite. Contract breaks.
  • Mitigation: API versioning (M04); contract tests; governance (see M04 / M05); cross-team ADR review.
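
A contract test for the 200-versus-204 example might look like the sketch below. The contract contents, the field names, and the function `contract_violations` are assumptions made for illustration; dedicated contract-testing tools cover far more than this.

```python
# Hypothetical contract that consumer A publishes for provider B's order endpoint.
CONTRACT = {
    "status": 200,
    "required_fields": {"order_id": str, "total_cents": int},
}


def contract_violations(status, body, contract=CONTRACT):
    """Return a list of human-readable contract violations (empty means the response conforms)."""
    problems = []
    if status != contract["status"]:
        problems.append(f"expected status {contract['status']}, got {status}")
    for name, expected_type in contract["required_fields"].items():
        if name not in body:
            problems.append(f"missing field {name!r}")
        elif not isinstance(body[name], expected_type):
            problems.append(f"field {name!r} should be {expected_type.__name__}")
    return problems


# Team B's rewrite now answers 204 with an empty body.
for problem in contract_violations(204, {}):
    print("contract broken:", problem)
```

Run in team B's build pipeline, a check like this turns the silent change into a failed build before it reaches production.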

7. Transport cost is zero

  • Prediction: data movement has money cost (egress fees, serialization CPU, DB roundtrips).
  • Production failure: a cost report shows $40k/month in cross-AZ data-transfer fees from a chatty microservices mesh that was designed on a whiteboard where "services talk to each other" cost nothing.
  • Mitigation: colocate services; batch calls; cache; use cheaper wire formats (protobuf/gRPC); design for data locality.
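
A back-of-the-envelope estimate makes the transport cost visible before the invoice does. Every number below is a placeholder: the price per GB, the request rate, and the payload sizes are hypothetical and should be replaced with your provider's actual pricing and your own traffic measurements.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_transfer_cost(requests_per_second, bytes_per_request, price_per_gb):
    """Estimate monthly data-transfer cost for a steady stream of cross-zone calls."""
    gb_per_month = requests_per_second * bytes_per_request * SECONDS_PER_MONTH / 1e9
    return gb_per_month * price_per_gb


PRICE_PER_GB = 0.02  # hypothetical price; check your provider's rate card

chatty = monthly_transfer_cost(5_000, 50_000, PRICE_PER_GB)   # 50 KB payloads
trimmed = monthly_transfer_cost(5_000, 5_000, PRICE_PER_GB)   # 5 KB payloads
print(f"chatty: ${chatty:,.0f}/month; trimmed payloads: ${trimmed:,.0f}/month")
```

The estimate is linear in both call rate and payload size, so batching calls and trimming payloads each reduce the bill proportionally.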

8. The network is homogeneous

  • Prediction: different clients, firewalls, proxies, and networks behave differently.
  • Production failure: a customer on a mobile connection misses half your websocket events; a corporate proxy blocks PATCH requests; an HTTP/2 feature does not work on an older LB.
  • Mitigation: degrade gracefully; test on representative networks; use standard verbs; fallback paths.

Monoliths dodge all of this

A modular monolith (Concept 4) has none of these as operational concerns at the intra-application level. In-process calls do not drop, have nanosecond latency, and do not incur transport cost. That is not trivia -- it is the structural reason monoliths are cheaper to build right than distributed systems. The fallacies are the price tag of distribution.

Common Confusion / Misconception

"Modern infrastructure has solved these." Cloud VPCs, service meshes, and managed databases mitigate several fallacies. They do not remove them. The network is still not reliable; latency is still not zero; bandwidth is still not infinite. The tooling changes the constant, not the category.

"These apply to microservices, not event-driven." They apply to any architecture that crosses a process boundary. Event-driven is particularly exposed to fallacies 1 and 5 because events can be lost or reordered.

"This is paranoia." It is history. Every fallacy has killed a production system, publicly, multiple times. "Paranoia" is treating them as optional.

"My system is small; fallacies do not apply yet." Scale reveals them; it does not create them. They apply the moment you have two processes talking. At small scale you survive by luck. At large scale you get paged.

How To Use It

For any proposed distributed style, print the fallacies next to the tax ledger (Concept 11). For each fallacy, answer: "What is our mitigation?" If you cannot answer one or more, the proposal is not ready.
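
The exercise can be kept as a small checklist in code or in a spreadsheet. The sketch below is one possible shape; `unmitigated` and the sample `proposal` entries are hypothetical.

```python
FALLACIES = [
    "The network is reliable",
    "Latency is zero",
    "Bandwidth is infinite",
    "The network is secure",
    "Topology doesn't change",
    "There is only one administrator",
    "Transport cost is zero",
    "The network is homogeneous",
]


def unmitigated(ledger):
    """Return the fallacies for which the proposal gives no (or only a blank) mitigation."""
    return [f for f in FALLACIES if not ledger.get(f, "").strip()]


# Hypothetical proposal that has answered only three of the eight questions.
proposal = {
    "The network is reliable": "timeouts, retries with backoff and jitter, circuit breakers",
    "Latency is zero": "latency budget per call chain; async where possible",
    "Bandwidth is infinite": "pagination and field selection",
}

gaps = unmitigated(proposal)
print(f"ready: {not gaps}")
for fallacy in gaps:
    print("no mitigation for:", fallacy)
```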

This is a 1-page exercise per proposal. It catches most of the avoidable failures.

Check Yourself

  1. Name the eight fallacies in order if you can; in any order if you cannot.
  2. For each fallacy, state one concrete production failure it predicts.
  3. Which style in this module dodges all eight? Why?

Mini Drill or Application

Pick one distributed system you have worked on or know. In 20 minutes:

  1. For each fallacy, write 1-2 sentences: "where did this hit us?" or "how are we mitigating it?"
  2. Flag any fallacy with no mitigation.
  3. Propose one concrete fix for each flagged item.

Read This Only If Stuck