
Partitioning Goals: Scaling Beyond One Node

What This Concept Is

Partitioning (also called sharding) means splitting one logical dataset across several nodes so that each node holds only a slice. Replication gives you many copies of all the data; partitioning gives you many pieces of one dataset. Real systems almost always combine both.

The canonical reasons to partition:

  • Scalability of storage: when one dataset exceeds the largest practical disk (or can no longer be backed up within a practical backup window), it must be split.
  • Scalability of write throughput: a single leader has a ceiling on writes per second. Partitioning by key spreads writes across N leaders, raising the ceiling roughly linearly as long as the key spreads the load evenly.
  • Query parallelism: a query that must touch many rows can run in parallel across partitions (MapReduce-style) instead of sequentially on one node.
  • Failure isolation: losing one partition affects only a slice of the data, not the whole service (and if that slice is also replicated, even the slice survives).

Partitioning turns "how big can one machine be?" into "how many machines can I afford?" That is a qualitatively different question, and it introduces new failure modes: uneven splits, hotspots, cross-partition queries, and rebalancing.
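
To make the hotspot failure mode concrete, here is a minimal sketch that counts writes per partition for an even workload and for one where a single key receives half of all writes. It assumes hash partitioning over 8 partitions; `partition_for`, `writes_per_partition`, `imbalance`, and the key names are hypothetical and exist only for illustration.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def writes_per_partition(write_keys, num_partitions):
    """Count how many writes land on each partition."""
    counts = Counter(partition_for(k, num_partitions) for k in write_keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def imbalance(loads):
    """Busiest partition divided by the mean load; 1.0 is a perfectly even split."""
    return max(loads) / (sum(loads) / len(loads))


NUM_PARTITIONS = 8  # assumed for this sketch

# Even workload: 10,000 users, one write each.
uniform = [f"user:{i}" for i in range(10_000)]
# Skewed workload: the same users plus one key that receives half of all writes.
skewed = uniform + ["user:celebrity"] * 10_000

uniform_loads = writes_per_partition(uniform, NUM_PARTITIONS)
skewed_loads = writes_per_partition(skewed, NUM_PARTITIONS)

print("uniform imbalance:", round(imbalance(uniform_loads), 2))
print("skewed imbalance: ", round(imbalance(skewed_loads), 2))
```

In the skewed case the partition that owns the hot key receives at least half of the 20,000 writes while the mean is 2,500 per partition, so the busiest partition carries at least four times its fair share, and adding more partitions does not reduce its load.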

Why It Matters Here

Everything in Cluster 4 (partitioning strategies) is about answering "by what key?" and "using what scheme?" That cluster only makes sense if you have internalized that partitioning is not free. You gain throughput and storage; you pay with routing, cross-partition coordination, and a new class of operational tasks (rebalancing, splitting, merging).

Partitioning also changes how you think about writes: a write that was trivial on one node (a single INSERT) now begins with a routing question (which node owns this key?) that must be answered before any row is written. A transaction that touches keys on more than one partition becomes a genuine coordination problem.
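
A minimal sketch of that routing step, assuming hash partitioning over a fixed number of partitions. `ShardedStore`, `partition_for`, and `NUM_PARTITIONS` are hypothetical names, and each partition is an in-memory dict standing in for a separate database.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed fixed partition count for this sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition index with a stable hash.

    Python's built-in hash() is salted per process for strings, so a digest
    is used here to get the same answer on every client and node.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class ShardedStore:
    """In-memory stand-in for N independent databases (one dict per partition)."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS) -> None:
        self.num_partitions = num_partitions
        self.partitions = [dict() for _ in range(num_partitions)]

    def insert(self, key: str, row: dict) -> int:
        # Routing comes first: decide which partition owns the key...
        owner = partition_for(key, self.num_partitions)
        # ...and only then write the row to that partition.
        self.partitions[owner][key] = row
        return owner

    def get(self, key: str):
        # Reads by key follow the same routing rule as writes.
        owner = partition_for(key, self.num_partitions)
        return self.partitions[owner].get(key)


store = ShardedStore()
owner = store.insert("user:42", {"name": "Ada"})
print("partition", owner, "holds", store.get("user:42"))
```

Client-side hashing like this is only one option; a coordinator or a routing layer would answer the same question with a lookup instead of a computation.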

Concrete Example

Suppose a single PostgreSQL instance handles 10,000 writes/sec and stores 4 TB of data. The application's user base triples.

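  • Without partitioning: scale the box vertically (bigger CPU, faster disks) until it reaches its limit around 30k writes/sec and 10 TB. Tripled demand (roughly 30,000 writes/sec and 12 TB, if load and data grow with the user base) already sits at or beyond that limit. Then you are stuck.
  • With partitioning by user_id: 10 shards each take about 3,000 writes/sec of the tripled load. If each shard matches the original instance, total capacity scales to 100k writes/sec and 40 TB. Each shard is a normal Postgres you already know how to operate. Disaster recovery is per-shard (smaller blast radius).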
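
The arithmetic behind these figures can be written out as a small worked sketch. It rests on three assumptions that the example leaves implicit: load and data grow in proportion to the user base, user_id spreads load evenly, and each shard has the same capacity as the original instance. All variable names are illustrative.

```python
# Figures from the example; every name here is illustrative.
node_writes_per_sec = 10_000   # what one Postgres instance handles today
node_storage_tb = 4            # what one instance stores today
growth_factor = 3              # the user base triples
shard_count = 10

# Assumption: write load and data volume grow in proportion to users.
demand_writes = node_writes_per_sec * growth_factor      # 30_000 writes/sec
demand_storage_tb = node_storage_tb * growth_factor      # 12 TB

# Assumption: user_id spreads load evenly, so each shard gets an equal share.
per_shard_writes = demand_writes / shard_count           # 3_000 writes/sec
per_shard_storage_tb = demand_storage_tb / shard_count   # 1.2 TB

# Assumption: each shard has the same capacity as the original instance.
cluster_write_capacity = node_writes_per_sec * shard_count    # 100_000 writes/sec
cluster_storage_capacity_tb = node_storage_tb * shard_count   # 40 TB

headroom = cluster_write_capacity / demand_writes
print(f"per shard: {per_shard_writes:.0f} writes/sec, {per_shard_storage_tb:.1f} TB")
print(f"write headroom after sharding: {headroom:.1f}x")
```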

The cost shows up the first time a query needs data across shards (e.g., "top 10 spenders globally"). That query now fans out to 10 nodes, then merges and sorts the partial results, whereas on the monolith it was a single local query.
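
A minimal sketch of that fan-out (often called scatter-gather), assuming each shard is an in-memory dict of user to total spend and that each user's total lives on exactly one shard, which holds when the data is partitioned by user_id. `local_top_k`, `global_top_k`, and the sample data are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def local_top_k(shard, k):
    """Runs on one shard: return its k biggest spenders as (user, total) pairs."""
    return heapq.nlargest(k, shard.items(), key=lambda item: item[1])


def global_top_k(shards, k):
    """Scatter the query to every shard, then gather and merge the results."""
    # Scatter: each shard computes its own top k in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: local_top_k(shard, k), shards))
    # Gather: the global top k must be among the per-shard top k lists,
    # because no user's total is split across shards.
    merged = [item for partial in partials for item in partial]
    return heapq.nlargest(k, merged, key=lambda item: item[1])


# Three tiny shards standing in for separate databases.
shards = [
    {"alice": 120.0, "bob": 40.0},
    {"carol": 300.0, "dave": 15.0},
    {"erin": 75.0},
]

print(global_top_k(shards, 2))  # [('carol', 300.0), ('alice', 120.0)]
```

The query is only as fast as the slowest shard, and the coordinator has to hold and sort up to k rows per shard, which is the extra work the monolith never had to do.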

Common Confusion / Misconception

"Partitioning replaces replication." No. Partitioning splits data; replication copies it. A partition-only system loses that shard's slice of the data whenever a shard dies. A real production cluster both partitions the data and replicates each partition (commonly 3 copies each).

"A good partition key is the most-read column." Usually the better choice is the column that writes are keyed on, or a value derived from it. The goal is to spread write load evenly. Read load can be absorbed by read replicas; a write hotspot on one partition cannot be fixed by adding replicas, because every write still has to go through that partition's leader.

How To Use It

When someone proposes partitioning, ask:

  1. Why: is the ceiling storage, writes/sec, query parallelism, or failure blast radius?
  2. What key? (random, user_id, tenant_id, geo, time range)
  3. Is access uniform across that key, or do hotspots exist (celebrity users, admin tenants, current timestamp)?
  4. How does the client find the right partition? Client-side hashing, a coordinator, or a routing layer?
  5. What happens when a shard fills or empties? (rebalancing plan)
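
Question 5 deserves a number. The sketch below measures how many keys change owner when a cluster grows from 10 to 11 nodes under two schemes: assigning keys directly with `hash(key) mod N`, and mapping keys to a fixed set of logical partitions that are then reassigned whole. Both schemes, the partition count of 1,000, and all function names are assumptions chosen for illustration, not a description of any particular database.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Process-independent hash (built-in hash() is salted for strings)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def moved_with_mod_n(keys, old_nodes, new_nodes):
    """Share of keys whose owner changes when owner = hash(key) % node_count."""
    moved = sum(
        1 for k in keys
        if stable_hash(k) % old_nodes != stable_hash(k) % new_nodes
    )
    return moved / len(keys)


def moved_with_fixed_partitions(keys, num_partitions, old_nodes, new_nodes):
    """Share of keys that move when only whole partitions are reassigned."""
    # Before: partition p lives on node p % old_nodes.
    before = {p: p % old_nodes for p in range(num_partitions)}
    after = dict(before)
    # Each new node takes an even share of partitions from the existing nodes.
    share = num_partitions // new_nodes
    next_partition = 0
    for node in range(old_nodes, new_nodes):
        for p in range(next_partition, next_partition + share):
            after[p] = node
        next_partition += share
    moved = sum(
        1 for k in keys
        if before[stable_hash(k) % num_partitions]
        != after[stable_hash(k) % num_partitions]
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
mod_n = moved_with_mod_n(keys, 10, 11)
fixed = moved_with_fixed_partitions(keys, 1_000, 10, 11)
print(f"hash mod N:       {mod_n:.0%} of keys move")
print(f"fixed partitions: {fixed:.0%} of keys move")
```

Under hash mod N a key stays put only when its hash gives the same remainder for both node counts, which for 10 and 11 nodes is about 1 key in 11, so most of the data moves. With fixed logical partitions only the reassigned partitions move (here 90 of 1,000), which is why the rebalancing plan is worth settling before the first shard fills.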

Check Yourself

  1. Name a workload where partitioning helps most, and one where it barely helps.
  2. Why does partitioning fail to scale write throughput if the partition key has a hotspot?
  3. What is lost, operationally, the moment you go from 1 shard to N shards?
  4. Why can't read replicas replace partitioning when the dataset grows beyond one disk?

Mini Drill or Application

For each workload, say whether partitioning is the right first move, and propose one partition key:

  1. A photo-sharing service storing billions of thumbnails.
  2. A regulatory-reporting database of 200 GB with heavy analytical queries.
  3. A Twitter-like feed where celebrity users produce 100x the load of median users.
  4. A time-series metrics store ingesting one row per sensor per second for a year.
  5. A multi-tenant SaaS app with 50,000 tenants and one very large customer representing 30% of volume.

Read This Only If Stuck