Reference and Selective Reading
You do not need to read the source books cover to cover for this module. Start with the concept pages and practice pages. Open the local chunks listed below only when you need an alternative explanation, more worked examples, or harder exercises.
Source Roles
| Source | Role | Why it is here |
|---|---|---|
| Designing Data-Intensive Applications (Kleppmann) | Primary teaching source | Strongest modern narrative on distributed systems; Chapters 8 (The Trouble with Distributed Systems) and 9 (Consistency and Consensus) supply most of the material for this module |
| Distributed Systems Concepts and Design (Coulouris et al.) | Canonical textbook | The rigorous academic treatment of time, failure models, mutual exclusion, elections, consensus, and coordination services |
| Database Internals (Petrov) | Implementation-focused support | Operational view of failure detection, gossip, Paxos, Multi-Paxos, Raft, ZAB, and PBFT |
| Database System Concepts (Silberschatz et al.) | Peripheral | Limited direct treatment of these topics; used sparingly |
Read Only If Stuck
Cluster 1: The Inescapable Reality
- DDIA: Faults and Partial Failures
- DDIA: Cloud Computing and Supercomputing
- DDIA: Unreliable Networks
- DDIA: Network Faults in Practice
- DDIA: Synchronous Versus Asynchronous Networks
- DDIA: Process Pauses (Part 1)
- DDIA: Process Pauses (Part 2)
- Database Internals: Fallacies of Distributed Computing
- Database Internals: Two Generals' Problem
- Database Internals: System Synchrony
- Database Internals: Concurrent Execution
- Coulouris: Challenges (Part 1)
- Coulouris: Challenges (Part 2)
- Coulouris: Challenges (Part 3)
- Coulouris: Fundamental Models (Part 3)
Cluster 2: Time, Clocks, and Ordering
- DDIA: Unreliable Clocks
- DDIA: Clock Synchronization and Accuracy
- DDIA: Relying on Synchronized Clocks (Part 1)
- DDIA: Relying on Synchronized Clocks (Part 2)
- DDIA: Ordering and Causality (Part 1)
- DDIA: Ordering and Causality (Part 2)
- DDIA: Sequence Number Ordering (Part 1)
- DDIA: Sequence Number Ordering (Part 2)
- DDIA: Total Order Broadcast (Part 1)
- Database Internals: Clocks and Time
- Database Internals: Ordering
- Coulouris: Synchronizing Physical Clocks (Part 1)
- Coulouris: Logical Time and Logical Clocks
Cluster 3: Failure Detection and Membership
- Database Internals: Chapter 9 - Failure Detection
- Database Internals: Phi-Accrual Failure Detector
- Database Internals: Summary (Chapter 9 - Failure Detection)
- Database Internals: Chapter 12 - Anti-entropy and Dissemination
- Database Internals: Read Repair
- Database Internals: Merkle Trees
- Database Internals: Gossip Dissemination
- Database Internals: Hybrid Gossip
- Database Internals: Omission Faults
- Database Internals: PBFT Algorithm
- DDIA: Byzantine Faults
- DDIA: System Model and Reality
- Coulouris: Gossip architecture (Part 1)
Cluster 4: Consensus
- DDIA: Distributed Transactions and Consensus
- DDIA: Fault-Tolerant Consensus (Part 1)
- DDIA: Fault-Tolerant Consensus (Part 2)
- DDIA: Fault-Tolerant Consensus (Part 3)
- DDIA: Membership and Coordination Services
- Database Internals: Chapter 14 - Consensus
- Database Internals: Virtual Synchrony
- Database Internals: ZooKeeper Atomic Broadcast (ZAB)
- Database Internals: Paxos
- Database Internals: Quorums in Paxos
- Database Internals: Multi-Paxos
- Database Internals: Egalitarian Paxos
- Database Internals: Generalized Solution to Consensus
- Database Internals: Raft
- Database Internals: Leader Role in Raft
- Coulouris: Consensus and related problems (Part 1)
- Coulouris: Consensus and related problems (Part 2)
- Coulouris: Consensus and related problems (Part 3)
Cluster 5: Distributed System Patterns
- DDIA: The Truth Is Defined by the Majority
- DDIA: Membership and Coordination Services
- DDIA: The End-to-End Argument for Databases (Part 1)
- DDIA: The End-to-End Argument for Databases (Part 2)
- DDIA: Summary (Chapter 8 Part 2)
- Database Internals: Chapter 10 - Leader Election
- Database Internals: Bully Algorithm
- Database Internals: Coordination Avoidance
- Coulouris: Elections (Part 1)
- Coulouris: Data storage and coordination services (Part 1)
- Coulouris: Data storage and coordination services (Part 2)
Optional Deep Dive
- Database Internals: Generalized Solution to Consensus - the view that Raft and Paxos instantiate the same abstract problem.
- Database Internals: Virtual Synchrony - the older view-synchronous group communication tradition.
- DDIA: Total Order Broadcast (Part 2) - bridges consensus and replicated state machines.
- Database Internals: Read Repair / Merkle Trees - anti-entropy beyond gossip.
- Coulouris: Consensus summary - the section's closing argument (Byzantine generals).
Concept-to-Source Map
| Primary concept | Best source if stuck | Why this source |
|---|---|---|
| Eight fallacies | Database Internals: Fallacies of Distributed Computing | Cleanest compact list |
| Partial failure | DDIA: Faults and Partial Failures | The clearest single page |
| Asynchrony / slow vs dead | DDIA: Timeouts and Unbounded Delays | Motivates the FLP impossibility result without the formalism |
| Physical clocks / NTP limits | DDIA: Unreliable Clocks | Modern operational perspective |
| Lamport clocks and happens-before | Coulouris: Logical Time and Logical Clocks | Rigorous textbook treatment |
| Vector clocks | DDIA: Ordering and Causality (Part 2) | Ties vector clocks to causal consistency |
| Heartbeats / phi-accrual | Database Internals: Phi-Accrual Failure Detector | Operational algorithm in one page |
| Gossip / SWIM | Database Internals: Gossip Dissemination | The best concise treatment |
| Byzantine faults | DDIA: Byzantine Faults | Pragmatic take: when BFT applies |
| Why consensus | DDIA: Distributed Transactions and Consensus | The reductions in one page |
| Paxos | Database Internals: Paxos | Clearest algorithmic pass |
| Raft | Database Internals: Raft | Concise, implementation-aware Raft |
| Leader election / split-brain | DDIA: The Truth Is Defined by the Majority | Where fencing and leases belong |
| Idempotency / exactly-once | DDIA: The End-to-End Argument for Databases (Part 1) | The foundational argument |
| Coordination services | DDIA: Membership and Coordination Services | Explains the abstraction behind ZooKeeper, etcd, and Consul |