Reference and Selective Reading
You do not need to read the source books front to back for this module. Use the concept pages and practice pages first. Open the local chunks listed below only when a specific idea is still unclear after the concept page, or when you want a second voice on the same mechanism.
Source Roles
| Source | Role | Why it is here |
|---|---|---|
| Designing Data-Intensive Applications (Kleppmann) | Primary teaching source | The module's backbone: Chapter 5 (Replication) and Chapter 6 (Partitioning) |
| Database Internals (Petrov) | Implementation support | Used when you need protocol-level detail: failure detection, leader election, consistency models |
| Distributed Systems: Concepts and Design (Coulouris et al.) | Classical treatment | Used for group communication, replication fundamentals, and gossip-based systems |
| Database System Concepts (Silberschatz et al.) | Relational textbook view | Used when the distributed-systems literature is overwhelming and a classical RDBMS framing helps |
Read Only If Stuck
Cluster 1: Why Replicate and Partition
- DDIA: Chapter 5 Replication (opening)
- DDIA: Chapter 6 Partitioning (opening)
- DDIA: The cost of linearizability
- DDIA: The truth is defined by the majority
- Database Internals: Tunable consistency
- Database System Concepts: Data partitioning
Cluster 2: Replication Topologies
- DDIA: Synchronous versus asynchronous replication
- DDIA: Setting up new followers
- DDIA: Use cases for multi-leader replication
- DDIA: Handling write conflicts
- DDIA: Multi-leader replication topologies
- DDIA: Writing to the database when a node is down
- DDIA: Limitations of quorum consistency
- DDIA: Sloppy quorums and hinted handoff
- DDIA: Detecting concurrent writes (part 1)
- Distributed Systems: Gossip architecture (part 1)
Cluster 3: Replication Mechanics
- DDIA: Implementation of replication logs
- DDIA: Problems with replication lag
- DDIA: Monotonic reads
- DDIA: Change data capture
- DDIA: Ordering and causality (part 1)
- Database Internals: Session models
- Database System Concepts: 23.4 Replication (part 1)
- Database System Concepts: Weak replication (part 1)
Cluster 4: Partitioning Strategies
- DDIA: Partitioning by key range
- DDIA: Partitioning by hash of key
- DDIA: Skewed workloads and relieving hot spots
- DDIA: Partitioning secondary indexes by document
- DDIA: Partitioning secondary indexes by term
- DDIA: Strategies for rebalancing
- DDIA: Operations: automatic or manual rebalancing
- Database Internals: Database partitioning
- Database System Concepts: Dealing with skew (part 1)
Cluster 5: Practical Systems
- DDIA: Membership and coordination services
- DDIA: Implementing linearizable systems
- DDIA: The truth is defined by the majority
- DDIA: Process pauses (part 1)
- Database Internals: Chapter 9 Failure detection
- Database Internals: Phi-accrual failure detector
- Database Internals: Chapter 10 Leader election
- Database Internals: Chapter 11 Replication and Consistency
Optional Deep Dive
Open these only after the concept pages feel comfortable. They are worth the time if you plan to operate a distributed database in production or read systems papers.
- DDIA: Detecting concurrent writes (part 2)
- DDIA: Detecting concurrent writes (part 3)
- DDIA: Chapter 9 Consistency and Consensus
- DDIA: Linearizability
- DDIA: What makes a system linearizable (part 1)
- DDIA: Sequence-number ordering (part 1)
- DDIA: Total order broadcast (part 1)
- DDIA: Fault-tolerant consensus (part 1)
- Database Internals: Sequential consistency
- Database Internals: Causal consistency
- Database Internals: Strong eventual consistency and CRDTs
- Database Internals: Leader role in Raft
Concept-to-Source Map
| Primary concept | Best source if stuck | Why this source |
|---|---|---|
| Replication goals: availability, read throughput, geographic proximity (lower latency) | DDIA: Chapter 5 Replication | Names the three goals cleanly and refuses to conflate them |
| Partitioning goals: scaling beyond one node | DDIA: Chapter 6 Partitioning | Best narrative of what partitioning buys and costs |
| CAP intuition: partition tolerance as mandatory | DDIA: The cost of linearizability | Treats CAP as a runtime choice, not a static label |
| Single-leader replication | DDIA: Synchronous versus asynchronous replication | Builds the single-leader picture and the sync/async axis in one pass |
| Multi-leader replication: conflicts, topologies, convergence | DDIA: Multi-leader replication topologies | The clearest all-to-all / ring / star comparison |
| Leaderless replication: quorums, read-repair, hinted handoff | DDIA: Sloppy quorums and hinted handoff | Single place that holds all three mechanisms together |
| Replication log formats | DDIA: Implementation of replication logs | Authoritative side-by-side of statement / logical / physical |
| Synchronous vs asynchronous replication | DDIA: Synchronous versus asynchronous replication | The honest treatment of the durability tradeoff |
| Replication lag and session guarantees | DDIA: Problems with replication lag | Names and explains every anomaly the module uses |
| Range vs hash partitioning | DDIA: Partitioning by hash of key | Best direct comparison; read the range chunk alongside |
| Secondary indexes: local vs global | DDIA: Partitioning secondary indexes by term | Completes the local/global picture with the harder side |
| Rebalancing, shard splitting, and hotspots | DDIA: Strategies for rebalancing | Rebalancing schemes compared without hand-waving |
| Case studies: PostgreSQL, MongoDB, Cassandra | Database Internals: Chapter 11 Replication and Consistency | Bridges per-system behavior to consistency vocabulary |
| Managing failover and split-brain | DDIA: The truth is defined by the majority | The majority-is-truth framing that every safe failover relies on |