Partitioning and Rebalancing Workshop

Design a partitioned layout for five contrasting workloads. This page is not a survey of partitioning schemes; it is a sharpening stone for the "pick a key, then defend it" reflex.

Retrieval Prompts

  1. State the two main partitioning schemes and the scan pattern each makes efficient.
  2. Name the problem consistent hashing solves that hash mod N does not.
  3. State the difference between a local and a global secondary index.
  4. Name three rebalancing strategies and the kind of workload each one fits.
  5. State one reason a hash-partitioned key-value store can still have a hotspot.

Template: The Partitioning Design Memo

For every workload you design, produce a short memo with these fields:

Workload:
Partition key:
Scheme (range / hash / composite):
Rationale (one paragraph):
Hotspot risk (identify at least one):
Secondary indexes needed:
Index partitioning (local / global):
Rebalancing plan (adding a node):
Query pattern that will surprise us:

Workload 1: Time-Series Metrics

A monitoring system ingests 250k samples/sec across 5,000 metrics from 2M hosts. Queries are "show me metric X for host Y from 1 hour ago until now" and "aggregate metric X across all hosts over the last hour."

Design two candidate partitioning schemes. For each, identify the hotspot risk and the slow query.

One acceptable answer sketch:

  • Candidate A: range-partition by (metric_name, timestamp). Per-metric scans are local. Hotspot on the newest partition per metric.
  • Candidate B: hash-partition by (metric_name, host_id) and range-partition within by timestamp. Writes spread across hosts. Cross-host aggregates fan out.
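Candidate B can be sketched in a few lines to make its trade-off concrete. This is a minimal in-memory illustration, not a storage engine: `NUM_PARTITIONS`, `Partition`, and the helper names are hypothetical, and SHA-1 is used only as a convenient stable hash.

```python
import bisect
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 16  # hypothetical cluster size for this sketch


def partition_for(metric_name: str, host_id: str) -> int:
    """Hash the (metric, host) pair to choose a partition."""
    digest = hashlib.sha1(f"{metric_name}|{host_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class Partition:
    """Keeps each series sorted by timestamp so time-range reads are contiguous."""

    def __init__(self):
        self.series = defaultdict(list)  # (metric, host) -> sorted [(ts, value)]

    def write(self, metric, host, ts, value):
        bisect.insort(self.series[(metric, host)], (ts, value))

    def read_range(self, metric, host, start_ts, end_ts):
        rows = self.series.get((metric, host), [])
        lo = bisect.bisect_left(rows, (start_ts, float("-inf")))
        hi = bisect.bisect_right(rows, (end_ts, float("inf")))
        return rows[lo:hi]


partitions = [Partition() for _ in range(NUM_PARTITIONS)]


def write(metric, host, ts, value):
    partitions[partition_for(metric, host)].write(metric, host, ts, value)


def read_host(metric, host, start_ts, end_ts):
    # "Metric X for host Y over a time range" touches exactly one partition.
    return partitions[partition_for(metric, host)].read_range(
        metric, host, start_ts, end_ts
    )


def partitions_touched_by_aggregate(metric, hosts):
    # "Metric X across all hosts" must visit every partition that owns a host.
    return {partition_for(metric, host) for host in hosts}
```

Calling `partitions_touched_by_aggregate` with a few thousand host ids shows the fan-out cost of the cross-host query; `read_host` shows why the per-host query stays cheap.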

Which is better, and why?

Workload 2: Multi-Tenant SaaS

50,000 tenants. Top 3 tenants each generate 100x the median tenant's write volume. Most queries are tenant-scoped. The application needs to honor "delete all of tenant X's data" as a GDPR primitive.

  • Pick a partition key and scheme.
  • Design for the three giants -- do they live on dedicated shards?
  • What does "delete everything for tenant X" cost with your scheme?
  • How do you rebalance when a new giant emerges?
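One way to reason about the giants is a routing layer that hashes ordinary tenants onto shared shards and consults an override table for pinned tenants. The sketch below assumes tenant_id is the sole partition key; `TenantRouter`, the shard names, and the shard count are hypothetical, and moving the tenant's rows during promotion is deliberately left out.

```python
import hashlib


class TenantRouter:
    """Routes a tenant to a shard: override table first, hash otherwise."""

    def __init__(self, shared_shards: int):
        self.shared_shards = shared_shards
        self.dedicated = {}  # tenant_id -> dedicated shard name

    def shard_for(self, tenant_id: str) -> str:
        if tenant_id in self.dedicated:
            return self.dedicated[tenant_id]
        digest = hashlib.sha1(tenant_id.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % self.shared_shards
        return f"shared-{index}"

    def promote(self, tenant_id: str, shard_name: str):
        """Pin a newly emerged giant to its own shard.

        Only the routing entry changes here; copying the tenant's rows from
        the old shard to the new one is a separate step that is not shown.
        Returns (old_shard, new_shard) so a migration job knows what to copy.
        """
        old_shard = self.shard_for(tenant_id)
        self.dedicated[tenant_id] = shard_name
        return old_shard, shard_name
```

Because every row for a tenant resolves to a single shard, "delete everything for tenant X" targets one shard under this scheme, and promoting a giant leaves every other tenant's routing unchanged.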

Workload 3: Social Feed

A Twitter-like service with 500M users and a power-law distribution of follower counts. A celebrity with 50M followers posts once per hour. The read pattern is "show me the latest 50 posts from everyone I follow."

  • Which is worse: write fan-out (materializing every follower's timeline) or read fan-out (querying posts from every followee at read time)?
  • How do you partition posts? By user_id? By post_id? By time?
  • Where is the hotspot for a celebrity post, and how do you absorb it?
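A commonly described compromise between the two fan-out styles is a hybrid: materialize timelines for ordinary authors, and pull celebrity posts at read time. The sketch below is an in-memory illustration under that assumption; `FeedService` and the `celebrity_threshold` cutoff are hypothetical names, not part of any real system.

```python
import heapq
from collections import defaultdict


class FeedService:
    def __init__(self, celebrity_threshold: int = 10_000):
        self.celebrity_threshold = celebrity_threshold  # hypothetical cutoff
        self.followers = defaultdict(set)   # author -> readers
        self.following = defaultdict(set)   # reader -> authors
        self.posts = defaultdict(list)      # author -> [(ts, post_id)], oldest first
        self.timelines = defaultdict(list)  # reader -> materialized [(ts, post_id)]

    def follow(self, reader, author):
        self.followers[author].add(reader)
        self.following[reader].add(author)

    def is_celebrity(self, author) -> bool:
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, ts, post_id):
        self.posts[author].append((ts, post_id))
        if self.is_celebrity(author):
            return  # skip write fan-out; readers pull these posts instead
        for reader in self.followers[author]:
            self.timelines[reader].append((ts, post_id))

    def read(self, reader, limit: int = 50):
        # Start from the materialized timeline, then merge in celebrity posts.
        candidates = set(self.timelines[reader])
        for author in self.following[reader]:
            if self.is_celebrity(author):
                candidates.update(self.posts[author][-limit:])
        # The set removes duplicates if an author crossed the threshold
        # after some posts were already fanned out.
        return heapq.nlargest(limit, candidates)
```

The write cost of a celebrity post drops to one append, and the read cost grows only with the number of celebrities a reader follows. Where the celebrity's own post partition becomes the hotspot is still a question for your memo.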

Workload 4: Product Catalog Search

A product catalog for 1,000 retail partners. Each partner has 10k-1M products. Users search across all partners. Dominant query: WHERE name ILIKE '%query%' AND category = ?, sorted by price.

  • What partition key serves the common query?
  • What secondary indexes do you need?
  • Global or local indexes?
  • How would a term-partitioned (global) index, such as a DynamoDB global secondary index, differ operationally from a document-partitioned (local) index, as used by Elasticsearch and MongoDB?
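The operational difference between document-partitioned and term-partitioned indexes shows up in which partitions a read and a write must touch. The sketch below models only a category index over products partitioned by partner; the class names, `NUM_PARTITIONS`, and the hashing helper are hypothetical.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical


def stable_hash(value: str) -> int:
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")


class LocalIndexCluster:
    """Document-partitioned: each partition indexes only its own products."""

    def __init__(self):
        self.by_category = [defaultdict(set) for _ in range(NUM_PARTITIONS)]

    def put(self, product_id, partner_id, category):
        p = stable_hash(partner_id) % NUM_PARTITIONS
        self.by_category[p][category].add(product_id)
        return {p}  # a write touches one partition

    def find(self, category):
        # Scatter/gather: every partition may hold matches.
        ids = set().union(*(idx.get(category, set()) for idx in self.by_category))
        return ids, set(range(NUM_PARTITIONS))


class GlobalIndexCluster:
    """Term-partitioned: all entries for one category live on one partition."""

    def __init__(self):
        self.by_category = [defaultdict(set) for _ in range(NUM_PARTITIONS)]

    def put(self, product_id, partner_id, category):
        data_p = stable_hash(partner_id) % NUM_PARTITIONS
        index_p = stable_hash(category) % NUM_PARTITIONS
        self.by_category[index_p][category].add(product_id)
        return {data_p, index_p}  # a write may touch two partitions

    def find(self, category):
        index_p = stable_hash(category) % NUM_PARTITIONS
        return set(self.by_category[index_p].get(category, set())), {index_p}
```

Both `put` and `find` return the set of partitions touched, which is the quantity to compare in your memo: the local index pays on reads, the global index pays on writes and concentrates a popular category on one partition.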

Workload 5: Financial Ticker

8,000 symbols, quotes arriving at 1k updates/sec per symbol during market hours. Queries are "get the last 1 minute for symbol X" (very frequent) and "scan all symbols for price > threshold" (less frequent but important).

  • Partition key: symbol? timestamp? composite?
  • Can you answer the "scan all symbols" query without fan-out?
  • What happens at market open (every symbol goes hot simultaneously)?
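At the stated rates the peak aggregate is 8,000 × 1,000 = 8M updates/sec, so the scan query cannot afford to read quote history. One possible approach, sketched below, partitions by symbol and keeps a small "latest price" map beside each partition's rolling window. The scan still visits every partition, but it reads at most 8,000 entries in total. All names here are hypothetical, and the sketch assumes quotes for a symbol arrive in timestamp order.

```python
import hashlib
from collections import deque

NUM_PARTITIONS = 8   # hypothetical
WINDOW_SECONDS = 60  # the "last 1 minute" query window


class SymbolPartition:
    def __init__(self):
        self.recent = {}  # symbol -> deque of (ts, price), trimmed to the window
        self.latest = {}  # symbol -> most recent price

    def ingest(self, symbol, ts, price):
        window = self.recent.setdefault(symbol, deque())
        window.append((ts, price))
        # Assumes in-order arrival per symbol, so old quotes sit at the left.
        while window and window[0][0] < ts - WINDOW_SECONDS:
            window.popleft()
        self.latest[symbol] = price

    def last_minute(self, symbol):
        return list(self.recent.get(symbol, ()))

    def symbols_above(self, threshold):
        return {s for s, price in self.latest.items() if price > threshold}


partitions = [SymbolPartition() for _ in range(NUM_PARTITIONS)]


def partition_for(symbol: str) -> SymbolPartition:
    digest = hashlib.sha1(symbol.encode()).digest()
    return partitions[int.from_bytes(digest[:8], "big") % NUM_PARTITIONS]


def ingest(symbol, ts, price):
    partition_for(symbol).ingest(symbol, ts, price)


def last_minute(symbol):
    return partition_for(symbol).last_minute(symbol)  # single-partition read


def scan_above(threshold):
    # Fans out to every partition but reads only the latest-price maps.
    result = set()
    for partition in partitions:
        result |= partition.symbols_above(threshold)
    return result
```

Whether the scan can avoid fan-out entirely (for example by replicating the latest-price map to one place) is a trade-off against that 8M updates/sec write stream, and is worth arguing in the memo.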

Rebalancing Scenarios

For each, walk through what happens:

  1. A fixed-partition Couchbase-style cluster goes from 4 nodes to 5. Partition count stays 1024.
  2. A dynamic-partitioning HBase cluster's newest region hits 10 GB and splits.
  3. A Cassandra cluster using vnodes loses one node cleanly (heartbeat timeout).
  4. A MongoDB cluster adds one shard; the balancer starts migrating chunks.
  5. A hash-partitioned system changes its partition count from 128 to 256. What percent of keys move under naive hash mod N? Under consistent hashing?
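To check your estimate for scenario 5 empirically, the sketch below measures the fraction of keys whose owner changes under hash mod N and under a simple consistent-hash ring. The ring with 64 virtual points per bucket is an illustrative construction, not any particular database's implementation, and `compare`, `Ring`, and the key set are hypothetical names. Try other pairs such as (4, 5) as well, since doubling is a special case.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def mod_owner(key: str, n_buckets: int) -> int:
    return h(key) % n_buckets


class Ring:
    """Consistent-hash ring with several virtual points per bucket."""

    def __init__(self, n_buckets: int, vnodes: int = 64):
        points = sorted(
            (h(f"bucket-{b}#{v}"), b)
            for b in range(n_buckets)
            for v in range(vnodes)
        )
        self._hashes = [point for point, _ in points]
        self._owners = [bucket for _, bucket in points]

    def owner(self, key: str) -> int:
        # The first ring point clockwise from the key's hash owns the key.
        i = bisect.bisect_right(self._hashes, h(key)) % len(self._hashes)
        return self._owners[i]


def moved_fraction(keys, owner_before, owner_after) -> float:
    moved = sum(1 for key in keys if owner_before(key) != owner_after(key))
    return moved / len(keys)


KEYS = [f"key-{i}" for i in range(20_000)]


def compare(n_old: int, n_new: int):
    """Return (fraction moved under mod N, fraction moved under the ring)."""
    mod = moved_fraction(
        KEYS,
        lambda key: mod_owner(key, n_old),
        lambda key: mod_owner(key, n_new),
    )
    old_ring, new_ring = Ring(n_old), Ring(n_new)
    ring = moved_fraction(KEYS, old_ring.owner, new_ring.owner)
    return mod, ring
```

Run `compare(128, 256)` and `compare(4, 5)` and explain why the gap between the two schemes differs so much between those cases.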

Common Mistake Check

Each claim below is wrong or incomplete. For each, explain why:

  1. "Hash on user_id avoids all hotspots."
  2. "Range on timestamp is fine; we have lots of partitions."
  3. "We can always add a shard later if load grows."
  4. "Global secondary indexes are fast, so use them everywhere."
  5. "A fixed partition count of 32 will be enough forever."
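For mistake 5, it helps to see that a fixed partition count bounds both how many nodes can hold data and how evenly load can be spread. The sketch below models a fixed-partition cluster in which a new node steals whole partitions from the most loaded nodes; the function names and the stealing policy are assumptions for illustration, not a specific product's balancer.

```python
from collections import Counter


def initial_assignment(num_partitions: int, nodes: list) -> dict:
    """Spread partitions round-robin over the starting nodes."""
    return {p: nodes[p % len(nodes)] for p in range(num_partitions)}


def add_node(assignment: dict, new_node: str):
    """Give new_node its floor share by stealing from the busiest nodes.

    Returns (new_assignment, moved_partitions). Keys never change partition;
    only whole partitions change owner.
    """
    assignment = dict(assignment)
    donor_load = Counter(assignment.values())
    target = len(assignment) // (len(donor_load) + 1)
    moved = []
    for _ in range(target):
        donor = max(donor_load, key=donor_load.get)
        partition = next(p for p, node in assignment.items() if node == donor)
        assignment[partition] = new_node
        donor_load[donor] -= 1
        moved.append(partition)
    return assignment, moved
```

With 1024 partitions on 4 nodes, as in rebalancing scenario 1, adding a fifth node moves 204 partitions and nothing else. With 32 partitions already spread over 32 nodes, adding a 33rd node moves nothing, because there is no partition left to give it.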

Evidence Check

This workshop is complete only if:

  • You have written design memos (using the template) for at least three of the five workloads.
  • You have diagnosed at least one hotspot in each memo and proposed a fix.
  • You have walked through at least two rebalancing scenarios and described exactly what moves.
  • You can explain, in one paragraph, why a single hot key cannot be fixed by any partitioning scheme and what the application must do instead.
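One application-level technique for the last item is key salting: writes to a known-hot key are spread over several physical keys, and reads gather them back. The sketch below applies it to a counter. `HOT_KEYS`, `SALT_BUCKETS`, and the in-memory `store` are hypothetical stand-ins, and the sketch assumes the application already knows which keys are hot.

```python
import hashlib
import random
from collections import defaultdict

NUM_PARTITIONS = 16
SALT_BUCKETS = 8              # hypothetical number of sub-keys per hot key
HOT_KEYS = {"celebrity:42"}   # assumption: hot keys are known in advance

store = defaultdict(int)      # physical key -> counter (stand-in for storage)


def partition_for(physical_key: str) -> int:
    digest = hashlib.sha1(physical_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def physical_key_for_write(key: str) -> str:
    # Hot keys get a random suffix so writes land on different partitions.
    if key in HOT_KEYS:
        return f"{key}#{random.randrange(SALT_BUCKETS)}"
    return key


def physical_keys_for_read(key: str) -> list:
    # Reads of a hot key must gather every salted sub-key.
    if key in HOT_KEYS:
        return [f"{key}#{i}" for i in range(SALT_BUCKETS)]
    return [key]


def increment(key: str, amount: int = 1) -> None:
    store[physical_key_for_write(key)] += amount


def read_total(key: str) -> int:
    return sum(store.get(k, 0) for k in physical_keys_for_read(key))
```

The cost is visible in `physical_keys_for_read`: every read of a salted key becomes a small fan-out, which is why salting is usually reserved for the few keys that need it.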