Learning Resources
This module is populated from the local chunked books in library/raw/semester-06-databases-distributed/books. Use this page as a source map, not as an instruction to read everything.
Source Stack
| Book | Role | How to use it in this module |
|---|---|---|
| Designing Data-Intensive Applications (Kleppmann) | Primary teaching source | Default escalation for replication topologies, log formats, partitioning, and failover |
| Database Internals (Petrov) | Implementation-focused support | Use for consistency models, failure detection, leader election, and tunable consistency at the implementation level |
| Distributed Systems: Concepts and Design (Coulouris et al.) | Classical distributed-systems framing | Use for group communication, gossip protocols, and replicated-data fundamentals |
| Database System Concepts (Silberschatz et al.) | Relational-replication view | Use for textbook treatments of partitioning and replication in traditional RDBMS contexts |
Resource Map by Cluster
Cluster 1: Why Replicate and Partition
| Need | Best local chunk | Why |
|---|---|---|
| Replication overview and goals | DDIA: Chapter 5 Replication | Kleppmann's canonical opening on the three replication goals: lower latency, higher availability, and higher read throughput |
| Partitioning overview | DDIA: Chapter 6 Partitioning | Best narrative introduction to what partitioning buys and costs |
| Honest CAP framing | DDIA: The cost of linearizability | Treats CAP as a runtime choice, not a static label |
| Partition-tolerance and majorities | DDIA: The truth is defined by the majority | Strong framing of quorum as the basis of consistency |
| Tunable consistency and PACELC | Database Internals: Tunable consistency | The most honest treatment of "CAP is a spectrum" |
| Partitioning in relational systems | Database System Concepts: Data partitioning | Textbook framing that complements DDIA |
Cluster 2: Replication Topologies
| Need | Best local chunk | Why |
|---|---|---|
| Single-leader setup and sync | DDIA: Synchronous versus asynchronous replication | Cleanest presentation of single-leader mechanics |
| New-follower bootstrap | DDIA: Setting up new followers | Operational flow for bringing up a fresh replica: snapshot first, then catch up from the log |
| Multi-leader motivations | DDIA: Use cases for multi-leader replication | When to reach for multi-leader |
| Multi-leader conflicts | DDIA: Handling write conflicts | The conflict-resolution menu |
| Multi-leader topologies | DDIA: Multi-leader replication topologies | All-to-all, ring, star tradeoffs |
| Leaderless writes under failure | DDIA: Writing to the database when a node is down | Quorum and coordinator mechanics |
| Quorum limitations | DDIA: Limitations of quorum consistency | Why W+R>N is necessary but not sufficient |
| Sloppy quorums and hinted handoff | DDIA: Sloppy quorums and hinted handoff | The availability extension and its cost |
| Detecting concurrent writes | DDIA: Detecting concurrent writes (part 1) | Version vectors and conflict surfacing |
| Gossip architecture | Distributed Systems: Gossip architecture (part 1) | Classical case study of eventually-consistent replication |
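The W+R>N condition named in the quorum rows above is easy to check mechanically. The following is a minimal sketch, not code from any of the books: the function names are hypothetical, and the brute-force variant exists only to show that the inequality is equivalent to "every write set shares at least one node with every read set." It says nothing about the failure cases in the "Limitations of quorum consistency" chunk, which is why overlap is necessary but not sufficient.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Return True when every write quorum must share a node with every read quorum."""
    return w + r > n


def overlap_by_enumeration(n: int, w: int, r: int) -> bool:
    """Check the same property by enumerating all possible write and read node sets."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

For a quick self-check, compare both functions for a few configurations such as `(n=3, w=2, r=2)` and `(n=5, w=2, r=3)`; they should always agree.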
Cluster 3: Replication Mechanics
| Need | Best local chunk | Why |
|---|---|---|
| Replication log formats | DDIA: Implementation of replication logs | The authoritative side-by-side |
| CDC from logical logs | DDIA: Change data capture | Shows why logical replication is the CDC substrate |
| Replication-lag anomalies | DDIA: Problems with replication lag | Every anomaly named and explained |
| Monotonic-reads guarantee | DDIA: Monotonic reads | Cleanest explanation of the guarantee and its mechanism |
| Causality and session guarantees | DDIA: Ordering and causality (part 1) | Connects anomaly vocabulary to happens-before |
| Session consistency models | Database Internals: Session models | Concrete session guarantees in running systems |
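One mechanism DDIA describes for monotonic reads is to send each user's reads to the same replica, chosen from a hash of the user ID. The sketch below illustrates that idea only. The function name and the replica labels are hypothetical, and it assumes a fixed replica list; if the chosen replica fails, reads must be rerouted and the guarantee no longer holds automatically.

```python
import hashlib


def replica_for_user(user_id: str, replicas: list[str]) -> str:
    """Pick the same replica for a given user on every call.

    A stable hash is used instead of Python's built-in hash(), which is
    randomized per process for strings and would break stickiness across restarts.
    """
    if not replicas:
        raise ValueError("replicas must not be empty")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]
```

Calling `replica_for_user("alice", ["replica-a", "replica-b", "replica-c"])` repeatedly returns the same label, so a user never moves from a fresher replica to a staler one while the list is unchanged.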
Cluster 4: Partitioning Strategies
| Need | Best local chunk | Why |
|---|---|---|
| Range partitioning | DDIA: Partitioning by key range | When range beats hash |
| Hash partitioning | DDIA: Partitioning by hash of key | When hash beats range |
| Hotspot mitigation | DDIA: Skewed workloads and relieving hot spots | Application-level salting, and why a partitioning scheme alone cannot relieve a single hot key |
| Local secondary indexes | DDIA: Partitioning secondary indexes by document | Canonical scatter-gather explanation |
| Global secondary indexes | DDIA: Partitioning secondary indexes by term | Canonical term-partitioned explanation |
| Rebalancing strategies | DDIA: Strategies for rebalancing | Fixed-count, dynamic, and node-proportional schemes compared |
| Rebalancing automation | DDIA: Operations: automatic or manual rebalancing | The "should we do this automatically" question |
| Relational-style partitioning | Database System Concepts: Data partitioning | Textbook treatment of data partitioning |
| Skew handling (textbook) | Database System Concepts: Dealing with skew (part 1) | Classical skew-handling techniques |
| Database partitioning overview | Database Internals: Database partitioning | Implementation-level framing |
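The range, hash, and hotspot rows above can be compared side by side in a few lines. This is a minimal sketch under stated assumptions: all function names are hypothetical, `boundaries` is assumed to be sorted, and `num_partitions` is assumed to be a fixed partition count rather than the current node count (the rebalancing chunk covers why hashing modulo the node count is a poor choice). The salting helper follows the idea in the hot-spots chunk of spreading one hot key across several derived keys, at the cost of reading all of them back.

```python
import bisect
import hashlib


def range_partition(key: str, boundaries: list[str]) -> int:
    """Partition i holds keys k with boundaries[i-1] <= k < boundaries[i]."""
    return bisect.bisect_right(boundaries, key)


def hash_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition via a stable hash; adjacent keys scatter."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def salted_keys(hot_key: str, num_salts: int) -> list[str]:
    """All derived keys for a hot key; a writer picks one, a reader fetches all."""
    return [f"{hot_key}#{salt:02d}" for salt in range(num_salts)]
```

With `boundaries = ["g", "n", "t"]`, range partitioning keeps neighboring keys together, which makes range scans cheap. Hash partitioning gives up that locality in exchange for an even spread.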
Cluster 5: Practical Systems
| Need | Best local chunk | Why |
|---|---|---|
| Coordination services | DDIA: Membership and coordination services | How ZooKeeper/etcd fit the cluster |
| Majority-based truth | DDIA: The truth is defined by the majority | The core of safe failover |
| Linearizable implementations | DDIA: Implementing linearizable systems | Why consensus-based coordinators exist |
| Process pauses and safety | DDIA: Process pauses (part 1) | Why a lease checked against the local clock is unsafe when the holder can be paused |
| Failure detection | Database Internals: Chapter 9 Failure detection | How "is the leader dead?" is actually answered |
| Phi-accrual failure detector | Database Internals: Phi-accrual failure detector | Cassandra's failure detection |
| Leader election | Database Internals: Chapter 10 Leader election | Protocol-level treatment |
| Replication and consistency overview | Database Internals: Chapter 11 Replication and Consistency | Bridge into consistency models |
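The phi-accrual row above is easier to hold onto with the arithmetic in view. The sketch below is a simplification, not the detector as implemented in any system. It assumes heartbeat intervals are exponentially distributed, whereas the original phi-accrual paper fits a normal distribution to a sliding window of inter-arrival times. The function name and parameters are hypothetical.

```python
import math


def phi(time_since_last_heartbeat: float, mean_interval: float) -> float:
    """Suspicion level under an assumed exponential model of heartbeat intervals.

    P(next heartbeat arrives later than t) = exp(-t / mean), so
    phi = -log10(exp(-t / mean)) = t / (mean * ln 10).
    """
    if mean_interval <= 0:
        raise ValueError("mean_interval must be positive")
    return time_since_last_heartbeat / (mean_interval * math.log(10))
```

Under this model, each additional unit of phi means the silence is ten times less likely to be an ordinary delay. The caller compares phi against a threshold instead of using a fixed timeout, which is what makes the detector "accrual" rather than binary.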
External Resources (Validated)
These URLs were validated at the time of writing. Use them for primary-source material beyond the books.
Jepsen Analyses (Kyle Kingsbury)
Real-world correctness tests that expose replication and consistency violations under fault injection. The best training for "what goes wrong under a partition."
- Jepsen: Analyses index -- start here.
- Jepsen: MongoDB 4.2.6 -- even strongest read/write concerns failed to preserve snapshot isolation.
- Jepsen: MongoDB 3.6.4 -- sharded-cluster safety analysis.
- Jepsen: MongoDB 3.4.0-rc3 -- v0 replication protocol loses majority-committed writes.
- Aphyr: Jepsen Cassandra original post -- foundational write-up of the Dynamo-style consistency model.
Martin Kleppmann Blog
- Please stop calling databases CP or AP (2015) -- the essay behind this module's CAP framing.
Official System Documentation
- PostgreSQL: High Availability, Load Balancing, and Replication -- the canonical PG replication reference.
- PostgreSQL: Replication configuration parameters -- knobs for sync, lag, slot management.
- PostgreSQL: Streaming Replication Protocol -- the wire protocol.
- Cassandra: Dynamo architecture -- replication factor, consistent hashing, tokens.
- Cassandra Basics -- replication factor and consistency level.
- MongoDB: Replication -- replica sets overview.
- MongoDB: Replica Set Elections -- Raft-like primary election.
Primary Literature
- Dynamo: Amazon's Highly Available Key-value Store (SOSP 2007) -- the paper behind Cassandra, Riak, and most leaderless systems.
Exercise Support Chunks
Use these when the concept pages are understood but you need volume:
Use Rules
- If a concept feels shaky, go to DDIA Chapter 5 or 6 for the corresponding section first; they are the module's spine.
- If DDIA's narrative is clear but you want an operational picture, open Database Internals.
- Open one chunk for one concept gap. Do not read chapter sequences by default.
- Every primary concept in this module maps to at least one book chunk above. If you cannot find the mapping, re-read the concept page -- the gap is probably there, not in the source.
- Use Jepsen reports only after you have internalized the module vocabulary. They are unforgiving of vague thinking.