Reference and Selective Reading

You do not need to read the source books front to back for this module. Start with the concept pages and practice pages. Open the local chunks listed here only when a specific idea is still unclear after the concept page, or when you want a second voice on the same mechanism.

Source Roles

Source	Role	Why it is here
Designing Data-Intensive Applications (Kleppmann)	Primary teaching source	The module's backbone: Chapter 5 (Replication) and Chapter 6 (Partitioning)
Database Internals (Petrov)	Implementation support	Used when you need protocol-level detail: failure detection, leader election, consistency models
Distributed Systems: Concepts and Design (Coulouris et al.)	Classical treatment	Used for group communication, replication fundamentals, and gossip-based systems
Database System Concepts (Silberschatz et al.)	Relational textbook view	Used when the distributed-systems literature is overwhelming and a classical RDBMS framing helps

Read Only If Stuck

Cluster 1: Why Replicate and Partition

Cluster 2: Replication Topologies

Cluster 3: Replication Mechanics

Cluster 4: Partitioning Strategies

Cluster 5: Practical Systems

Optional Deep Dive

Open these only after the concept pages feel comfortable. They are worth the time if you plan to operate a distributed database in production or read systems papers.

Concept-to-Source Map

Primary concept	Best source if stuck	Why this source
Replication goals: availability, throughput, geolocation	DDIA: Chapter 5 Replication	Names the three goals cleanly and refuses to conflate them
Partitioning goals: scaling beyond one node	DDIA: Chapter 6 Partitioning	Best narrative of what partitioning buys and costs
CAP intuition: partition-tolerance as mandatory	DDIA: The cost of linearizability	Treats CAP as a runtime choice, not a static label
Single-leader replication	DDIA: Synchronous versus asynchronous replication	Builds the single-leader picture and the sync/async axis in one pass
Multi-leader replication: conflicts, topologies, convergence	DDIA: Multi-leader replication topologies	The clearest all-to-all / ring / star comparison
Leaderless replication: quorums, read-repair, hinted handoff	DDIA: Sloppy quorums and hinted handoff	Single place that holds all three mechanisms together
Replication log formats	DDIA: Implementation of replication logs	Authoritative side-by-side of statement / logical / physical
Synchronous vs asynchronous replication	DDIA: Synchronous versus asynchronous replication	The honest treatment of the durability tradeoff
Replication lag and session guarantees	DDIA: Problems with replication lag	Names and explains every anomaly the module uses
Range vs hash partitioning	DDIA: Partitioning by hash of key	Best direct comparison; read the range chunk alongside
Secondary indexes: local vs global	DDIA: Partitioning secondary indexes by term	Completes the local/global picture with the harder side
Rebalancing, shard splitting, and hotspots	DDIA: Strategies for rebalancing	Rebalancing schemes compared without hand-waving
Case studies: PostgreSQL, MongoDB, Cassandra	Database Internals: Chapter 11 Replication and Consistency	Bridges per-system behavior to consistency vocabulary
Managing failover and split-brain	DDIA: The truth is defined by the majority	The majority-is-truth framing that every safe failover relies on
