Range Partitioning vs Hash Partitioning

What This Concept Is

Once you have committed to partitioning, you must pick a scheme that maps every key to a partition. Two schemes cover the vast majority of real systems:

Range partitioning: assign contiguous key ranges to partitions (A-F -> P0, G-M -> P1, N-S -> P2, T-Z -> P3). Preserves ordering. Range scans are local. Used by HBase, Bigtable, CockroachDB, and for time-series data in many systems.
Hash partitioning: hash the key with a well-distributed function and assign to partitions by hash(key) mod N or by ranges in the hash space. Destroys ordering but guarantees uniform load. Used by Cassandra, DynamoDB (under the hood), MongoDB (hashed shard keys), Redis Cluster.

Both have a third sibling worth knowing:

Consistent hashing: hash keys and nodes onto the same ring; each key is owned by the next node clockwise. Minimizes data movement on node add/remove. Used by Cassandra, Riak, Memcached.

  Range:                                Hash:
  
  P0 [A..F]  P1 [G..M]  P2 [N..S]       hash('alice') % 4 = 1 -> P1
  P3 [T..Z]                             hash('bob')   % 4 = 3 -> P3
                                        hash('carol') % 4 = 0 -> P0
  scans ordered by key within           keys appear scattered; scans
  each partition; cross-partition       by original key require fan-out
  scans possible

Why It Matters Here

This choice dictates:

Whether range scans are cheap (range wins) or a fan-out query across every shard (hash loses).
Whether writes can avalanche onto a single partition (range loses if the key is monotonic) or spread evenly (hash wins).
How rebalancing works: range splits partitions in half when they grow; hash requires a consistent-hashing scheme to avoid rehashing everything.

Almost every "my shard is on fire" incident is either "we hash-partitioned and it's a single hot key" or "we range-partitioned by timestamp and writes go to the last shard."

Concrete Example

A time-series metrics store stores rows keyed by (metric_name, timestamp).

Range by timestamp: current writes all land on the newest partition. That partition is a hotspot 100% of the time; old partitions are cold. Terrible.
Hash by (metric_name, timestamp): writes spread uniformly. Great for ingest. Terrible for "show me the last 1 hour of CPU" because that query now fans out to every partition.
Range by metric_name, then range by timestamp within each metric: a typical composite scheme. Per-metric scans are local. Write load spreads across metrics, but a single popular metric can still hotspot.
Salted key (range on hash_prefix + metric_name + timestamp): keeps the range ordering within a metric but adds artificial sharding.

Cassandra PRIMARY KEY ((metric_name), timestamp) with a hashed partition key is the canonical composite: hash on metric, range on timestamp within the metric.

Common Confusion / Misconception

"Hash partitioning kills hotspots forever." Only if the key space itself has no hotspots. hash(celebrity_user_id) still sends every celebrity-related write to one node. Hashing spreads keys, not load. Celebrity hotspots need application-level salting.

"Range partitioning is old." It is the right answer for ordered data (time series, log indexes, ranges of IDs). HBase and CockroachDB are built on range partitioning precisely because their workloads are range-heavy.

"Consistent hashing eliminates rebalancing." Only minimizes it. Adding one node still moves 1/N of the data. Node failure still requires moving the dead node's keys to living nodes.

How To Use It

Decision tree:

Does the application need range scans on the key (time-series, alphabetical listings, lexicographic prefixes)?
- Yes: range partitioning.
- No: consider hash.
Is the natural write key monotonic (timestamp, auto-increment ID)?
- Yes: hash on it, or salt it, or range on a different key.
Are some keys vastly hotter than others (celebrities, admin tenants, top SKUs)?
- Yes: add salt or use composite keys to spread that one key across partitions.
Will the cluster scale dynamically (add/remove nodes)?
- Yes: prefer consistent hashing over hash mod N.

Check Yourself

Why is "range partition by timestamp" almost always wrong for high-ingest time-series?
Why is "hash partition by user_id" insufficient for a social network with celebrity users?
What problem does consistent hashing solve that mod N does not?
What query pattern does range partitioning make fast that hash partitioning makes slow?

Mini Drill or Application

For each workload, pick a partition key and scheme, and name one hotspot risk:

A log-analytics system ingesting 200k events/sec from 50 services.
A B2B CRM with 1,000 tenants, where the top 3 tenants are each 100x the median.
A Twitter-like social feed keyed by (user_id, post_id).
A financial ticker storing per-symbol quotes for 8,000 symbols, mostly reads are "last 5 minutes for symbol X."
A public URL-shortening service where the key is the short code.

What This Concept Is​

Why It Matters Here​

Concrete Example​

Common Confusion / Misconception​

How To Use It​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​