Book Exercise Lanes

The concept pages in this module teach the material. The practice pages reinforce application. This page adds volume: use it when you want more passes through the same chunks of source material.

Each lane maps to one of the five clusters and points at the chunks in library/raw/semester-06-databases-distributed/books/ that a typical graduate student would work through for that topic.

How To Use This Page

  1. Finish the matching concept cluster and at least one of its drills first.
  2. Answer the target-outcome prompts from memory before opening any chunk below.
  3. Use the chunks to resolve gaps in your answers, or disagreements between your answers and the source, not as a recipe.
  4. Keep a mistake log. Useful tags: wrong quorum math, sync-vs-async mixed up, missed session guarantee, range vs hash confused, secondary-index scope wrong, rebalance not proportional, split-brain missed.

Lane 1: Why Replicate and Partition (Cluster 1)

Use this lane when the goal is clear prose: why do we replicate, why do we partition, and what is actually being traded off.

Target outcomes:

  • Written one-paragraph answers to "Why replicate?" and "Why partition?" without referring back to the chunks
  • Three concrete workload examples for each of: availability-first, throughput-first, and geolocation-first replication
  • Two examples where partitioning is the wrong tool (the bottleneck is not volume or write throughput)
  • A one-page CAP note explaining why every system "chooses" C or A only during an actual partition

Lane 2: Replication Topologies (Cluster 2)

Use this lane when you understand each topology in isolation but keep confusing them when failure modes change.

Target outcomes:

  • Three failure walk-throughs (one per topology) showing what happens to reads and writes
  • One concrete example where multi-leader is actually the right answer, and one where it is obviously wrong
  • Worked quorum examples for (N=3,W=2,R=2), (N=5,W=3,R=3), and one sloppy-quorum scenario
  • A one-paragraph note on why W+R>N is necessary but not sufficient for read-your-writes
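
The quorum arithmetic in this lane can be checked by brute force. The following is a minimal sketch, not taken from the source chunks: the helper names `quorums_always_overlap` and `tolerated_failures` are hypothetical, and it assumes strict quorums drawn only from the N home replicas. A sloppy quorum breaks exactly that assumption, because writes may land on nodes outside the home set, so the overlap shown here no longer holds.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """True if every W-sized write set shares a replica with every R-sized read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)  # empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def tolerated_failures(n: int, w: int, r: int) -> int:
    """Replicas that may be down while both reads and writes still reach quorum."""
    return n - max(w, r)


# The two configurations named in the lane, plus one that violates W+R>N.
for n, w, r in [(3, 2, 2), (5, 3, 3), (3, 1, 1)]:
    print(
        f"N={n} W={w} R={r}: "
        f"overlap={quorums_always_overlap(n, w, r)}, "
        f"W+R>N={w + r > n}, "
        f"tolerates {tolerated_failures(n, w, r)} down"
    )
```

The overlap only guarantees that a read contacts at least one replica that accepted the latest acknowledged write. It says nothing about concurrent writes, about which replica a later read is routed to, or about sloppy quorums, which is where the "not sufficient" part of the note comes from.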

Lane 3: Replication Mechanics (Cluster 3)

Use this lane when topologies are clear but the details of what flows over the wire, and how lag causes user-visible bugs, still feel fuzzy.

Target outcomes:

  • A one-page comparison of statement-based, logical (row-based), and physical log formats with one failure scenario each
  • Two drawn timelines, one async and one semi-sync, showing where a client-visible durability loss can occur
  • Three labeled replication-lag anomalies (stale read, non-monotonic read, causal violation) with the session guarantee that prevents each
  • A one-paragraph answer to "when is CDC the right tool and when is it an abuse of the replication log?"
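
One way to make the session guarantees concrete before drawing the timelines is to model a session that remembers the highest log position it has written or observed and refuses replicas that are behind it. This is a sketch under stated assumptions: `Replica`, `Session`, `StaleReplicaError`, and the integer `lsn` are illustrative names, not the API of any particular database. It covers read-your-writes and monotonic reads for a single session only; it does not address causal ordering across sessions.

```python
from dataclasses import dataclass, field


class StaleReplicaError(Exception):
    """The chosen replica has not caught up to what this session has seen."""


@dataclass
class Replica:
    name: str
    applied_lsn: int = 0  # highest log position applied so far
    data: dict = field(default_factory=dict)

    def apply(self, lsn: int, key: str, value: str) -> None:
        self.data[key] = value
        self.applied_lsn = lsn


@dataclass
class Session:
    min_lsn: int = 0  # highest position this client has written or observed

    def note_write(self, lsn: int) -> None:
        self.min_lsn = max(self.min_lsn, lsn)

    def can_serve(self, replica: Replica) -> bool:
        return replica.applied_lsn >= self.min_lsn

    def read(self, replica: Replica, key: str):
        if not self.can_serve(replica):
            raise StaleReplicaError(
                f"{replica.name} at lsn {replica.applied_lsn}, "
                f"session needs {self.min_lsn}"
            )
        # Remember how far forward we have seen, so later reads never go back.
        self.min_lsn = max(self.min_lsn, replica.applied_lsn)
        return replica.data.get(key)


leader, follower = Replica("leader"), Replica("follower")
session = Session()

leader.apply(1, "name", "alice")  # write acknowledged by the leader
session.note_write(1)             # follower has not replicated it yet

print(session.can_serve(follower))   # a read here would miss the session's own write
print(session.read(leader, "name"))  # the leader is safe to read from
```

In a real system the refusal would typically turn into a retry against another replica or a short wait, which is a design decision this sketch leaves open.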

Lane 4: Partitioning Strategies (Cluster 4)

Use this lane when you need many passes through real partition-key decisions and rebalancing scenarios.

Target outcomes:

  • A partition-key memo for five different workloads (time-series events, user profiles, group chat, social graph, analytics facts) with the chosen scheme and one failure mode
  • One worked hotspot example, identified first by metrics, then remediated with application-level splitting
  • One local-index and one global-index design for the same "find all orders by status" query, with query-cost and write-cost compared
  • A rebalancing plan for growing 4 nodes to 8 without a stop-the-world cutover
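
Before writing the rebalancing plan, it helps to measure how much data each scheme moves. The sketch below compares `hash mod N` placement with a fixed number of partitions that new nodes steal from existing ones. The assumptions are all mine: 256 equal-sized partitions, MD5 as the key hash, 10,000 synthetic keys, and the helper names `moved_fraction_mod_n`, `add_nodes`, and `moved_fraction`.

```python
import hashlib


def key_hash(key: str) -> int:
    """Stable hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def moved_fraction_mod_n(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose node changes under hash-mod-N placement."""
    moved = sum(1 for k in keys if key_hash(k) % old_n != key_hash(k) % new_n)
    return moved / len(keys)


def add_nodes(assignment: dict, new_nodes: list) -> dict:
    """Fixed partitions: each new node steals from the most loaded node until balanced."""
    result = dict(assignment)
    nodes = sorted(set(assignment.values())) + list(new_nodes)
    target = len(assignment) // len(nodes)
    owned = {n: [p for p, owner in result.items() if owner == n] for n in nodes}
    for new_node in new_nodes:
        while len(owned[new_node]) < target:
            donor = max(owned, key=lambda n: len(owned[n]))
            partition = owned[donor].pop()
            owned[new_node].append(partition)
            result[partition] = new_node
    return result


def moved_fraction(before: dict, after: dict) -> float:
    return sum(1 for p in before if before[p] != after[p]) / len(before)


keys = [f"user-{i}" for i in range(10_000)]
four_nodes = {p: p % 4 for p in range(256)}  # 256 partitions spread over nodes 0..3

for new_n in (5, 8):
    grown = add_nodes(four_nodes, list(range(4, new_n)))
    print(
        f"4 -> {new_n} nodes: mod-N moves {moved_fraction_mod_n(keys, 4, new_n):.0%}, "
        f"fixed partitions move {moved_fraction(four_nodes, grown):.0%}"
    )
```

Exact doubling is a special case: going from 4 to 8 nodes moves about half the data under either scheme, which is the minimum needed to fill the new nodes. The non-proportional cost of `hash mod N` shows up on uneven growth such as 4 to 5 nodes, where most keys change owner even though only a fifth of the data needs to move.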

Lane 5: Practical Systems (Cluster 5)

Use this lane when the theory is solid and you want to see how real systems embody it.

Target outcomes:

  • A three-column table comparing PostgreSQL streaming replication, MongoDB replica sets, and Cassandra rings, with one row each for replication topology, log format, sync mode, failure detection, election, and what "strong read" means
  • A written walk-through of a split-brain scenario in a single-leader system, showing how fencing (or its absence) decides whether the old leader is shut out cleanly or two leaders both accept writes and data is lost
  • A one-page comparison of "the leader is slow" vs "the leader is dead" for each system, naming the exact mechanism that decides
  • Notes from reading one Jepsen analysis end-to-end, with at least three tagged violations and the replication mechanism that caused each
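
The split-brain walk-through above can be rehearsed with a small fencing-token model. This is a minimal sketch assuming a lease service that hands out strictly increasing tokens and a storage layer that checks them. `LeaseService`, `FencedStore`, `UnfencedStore`, and `split_brain` are hypothetical names for illustration; none of the three systems in this lane exposes this exact interface.

```python
class StaleLeaderError(Exception):
    """A write carried a token older than one the store has already seen."""


class LeaseService:
    """Hypothetical lease service: every leadership grant gets a larger token."""

    def __init__(self):
        self._token = 0

    def grant_leadership(self, node: str) -> int:
        self._token += 1
        return self._token


class FencedStore:
    """Storage that rejects writes from any leader with an outdated token."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token: int, key: str, value: int) -> None:
        if token < self.highest_token:
            raise StaleLeaderError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.data[key] = value


class UnfencedStore:
    """Storage that ignores tokens: the last writer silently wins."""

    def __init__(self):
        self.data = {}

    def write(self, token: int, key: str, value: int) -> None:
        self.data[key] = value


def split_brain(store):
    leases = LeaseService()
    token_a = leases.grant_leadership("A")  # A leads, then stalls (e.g. a long pause)
    token_b = leases.grant_leadership("B")  # A's lease expires, B is promoted
    store.write(token_b, "balance", 100)    # the new leader's write
    try:
        store.write(token_a, "balance", 70)  # A resumes, still believing it leads
        return "stale write accepted", store.data["balance"]
    except StaleLeaderError:
        return "stale write rejected", store.data["balance"]


print(split_brain(FencedStore()))
print(split_brain(UnfencedStore()))
```

Without the token check, the new leader's acknowledged write is overwritten by the old leader. That is the data-loss case the walk-through should describe in prose.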

Self-Curated Problem Set

Produce at least one item in each category below:

  • Design: one partition-key decision for a new workload you encounter (your own, a blog post, a case study)
  • Diagnose: one stale-read story (yours or a public post-mortem) classified by anomaly and remediated with a session guarantee
  • Operate: one rebalancing plan for a cluster growing by at least 2x
  • Review: one Jepsen report summarized in five bullet points using the module's vocabulary

Completion Checklist

  • Completed at least one lane in full
  • Logged at least 10 real mistakes with tags and corrections
  • Wrote at least 6 full-paragraph operational justifications ("I chose X because... under failure Y it degrades to Z")
  • Re-attempted at least 3 previously failed drills after review
  • Produced at least one artifact from each of the four self-curated categories above
  • Can point, for every primary concept in the module, to one chunk in these lanes that makes it concrete