Skip to main content

Case Studies: PostgreSQL, MongoDB, Cassandra

What This Concept Is

Three widely-deployed databases, three very different positions on the replication/partitioning map. Pinning each to this module's vocabulary is the clearest way to convert concepts into operational literacy.

  • PostgreSQL: single-leader, physical streaming replication, no native partitioning across nodes. Sharding is an add-on (Citus, pg_shard, application-level).
  • MongoDB: single-leader replica sets (per shard), automatic sharding with config servers holding the map, range or hash partitioning per collection.
  • Cassandra: leaderless, peer-to-peer, consistent hashing with vnodes, tunable consistency via (W, R, N).

Every concept in Clusters 1-4 instantiates into one choice in each of these systems.

Why It Matters Here

Concept pages describe the space of possible designs; these three systems are the reference points most real-world conversations happen against. When a job interview asks "how does MongoDB handle replication?", you should be able to name: topology (single-leader per replica set), log format (oplog, logical), sync mode (configurable via writeConcern), failover mechanism (Raft-based election).

Concrete Example -- The Concept Map Mapped

ConceptPostgreSQLMongoDBCassandra
Replication topologySingle-leaderSingle-leader per replica setLeaderless
Log formatPhysical WAL (stream) + optional logicalOplog (logical, per-document)Commit log + memtable flush (per-node)
Sync modeAsync default; semi-sync via synchronous_standby_namesConfigurable via `writeConcern: { w: 1majority
Partitioning (native)Declarative partitioning within one node; no cross-node shardingAutomatic sharding; config-server-managed mapConsistent hashing (token ring); vnodes
RoutingClient to primary directlymongos coordinatorAny node acts as coordinator (gossip)
Secondary index partitioning(single node)Local (per-shard)Local (per-node); plus SASI / SAI variants
FailoverManual or via Patroni/repmgrAutomatic via Raft election in replica setNo failover per se; any node can coordinate
Consistency modelRead-your-writes on primary; follower reads stalereadConcern tunable: local / majority / linearizableTunable; never linearizable without LWT
CAP under partitionCP (writes stop on minority side after failover)CP (minority side read-only; writes fail)AP (any replica accepts writes if CL met)

Common Confusion / Misconception

"MongoDB is eventually consistent." MongoDB replica sets are single-leader; reads from the primary are strongly consistent. Only secondary reads are stale. Across shards, consistency follows the per-shard rules.

"Cassandra does not support failover." Cassandra does not need failover because there is no leader. Any replica can serve a write (subject to consistency level). That is why AP systems scale writes linearly -- no single node is special.

"PostgreSQL can't shard." Native single-table-across-nodes sharding is absent, but Citus, Postgres-XL, and application-level shard maps all exist. Most production Postgres at scale uses vertical scaling + application-level sharding.

How To Use It

When an interviewer, colleague, or architecture document mentions a system, mentally map it to this table:

  1. What is its replication topology?
  2. What is its log format?
  3. What is its default sync mode, and what are the knobs?
  4. How does it partition (if at all)?
  5. What is the client's routing path?
  6. What consistency does it offer per read, per write, across a partition?

Then compare to the row in the table above.

Failover Side-by-Side

  PostgreSQL (with Patroni):         MongoDB Replica Set:              Cassandra:

Primary dies -> Primary dies -> Node dies ->
Patroni detects (~5s) Raft election (~12s default) gossip notices
Picks replica with Majority elects new primary reads/writes route
highest LSN Writes stall during election around the node
Demotes old primary Writes resume once new primary hints accumulate
if reachable elected on peers, delivered
Promotes new primary Async writes lost if not yet when node returns
Updates DNS / proxy replicated no stop-the-world

Check Yourself

  1. Which of these three systems has the shortest write-unavailability window on node failure?
  2. Why does MongoDB need a config-server replica set but PostgreSQL does not?
  3. What does Cassandra's CL=QUORUM mean in terms of (W, R, N)?
  4. Where do secondary indexes live in each system?

Mini Drill or Application

For each scenario, pick the best fit among PostgreSQL, MongoDB, and Cassandra and justify:

  1. A financial ledger requiring ACID transactions and <1 s failover.
  2. A multi-region user-profile store where writes must succeed even during a cross-region partition.
  3. A schema-flexible catalog for a fast-moving product team that wants sharding without operating one.
  4. A time-series telemetry store for 10M devices at 10k writes/sec total.
  5. A regulated healthcare system needing point-in-time recovery and detailed audit.

Read This Only If Stuck