Partitioning and Rebalancing Workshop
Design a partitioned layout for five contrasting workloads. This page is not a survey of partitioning schemes; it is a sharpening stone for the "pick a key, then defend it" reflex.
Retrieval Prompts
- State the two main partitioning schemes and the scan pattern each makes efficient.
- Name the problem consistent hashing solves that hash mod N does not.
- State the difference between a local and a global secondary index.
- Name three strategies for rebalancing and state which kinds of workloads each fits.
- State one reason a hash-partitioned key-value store can still have a hotspot.
Template: The Partitioning Design Memo
For every workload you design, produce a short memo with these fields:
Workload:
Partition key:
Scheme (range / hash / composite):
Rationale (one paragraph):
Hotspot risk (identify at least one):
Secondary indexes needed:
Index partitioning (local / global):
Rebalancing plan (adding a node):
Query pattern that will surprise us:
Workload 1: Time-Series Metrics
A monitoring system ingests 250k samples/sec across 5,000 metrics from 2M hosts. Queries are "show me metric X for host Y from 1 hour ago until now" and "aggregate metric X across all hosts over the last hour."
Design two candidate partitioning schemes. For each, identify the hotspot risk and the slow query.
One acceptable answer sketch:
- Candidate A: range-partition by (metric_name, timestamp). Per-metric scans are local. Hotspot on the newest partition per metric.
- Candidate B: hash-partition by (metric_name, host_id) and range-partition within by timestamp. Writes spread across hosts. Cross-host aggregates fan out.
Which is better, and why?
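To make the trade-off concrete, here is a minimal in-memory sketch of Candidate B. It is an illustration under stated assumptions, not a storage engine: the partition count of 64, the class names, and the SHA-256 routing function are all hypothetical choices, and the partition count is treated as fixed.

```python
import hashlib
from bisect import insort
from collections import defaultdict

NUM_PARTITIONS = 64  # hypothetical; assumed fixed for this sketch


def partition_for(metric_name: str, host_id: str) -> int:
    """Hash the (metric_name, host_id) pair to a partition number."""
    digest = hashlib.sha256(f"{metric_name}|{host_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class Partition:
    """Keeps each series sorted by timestamp so time-range scans stay local."""

    def __init__(self) -> None:
        self.series = defaultdict(list)  # (metric, host) -> sorted [(ts, value)]

    def write(self, metric, host, ts, value):
        insort(self.series[(metric, host)], (ts, value))

    def scan(self, metric, host, start, end):
        points = self.series.get((metric, host), [])
        return [(t, v) for t, v in points if start <= t <= end]


class MetricsStore:
    def __init__(self) -> None:
        self.partitions = [Partition() for _ in range(NUM_PARTITIONS)]

    def write(self, metric, host, ts, value):
        self.partitions[partition_for(metric, host)].write(metric, host, ts, value)

    def query_host(self, metric, host, start, end):
        # "Metric X for host Y over a time range": exactly one partition.
        return self.partitions[partition_for(metric, host)].scan(metric, host, start, end)

    def aggregate(self, metric, start, end):
        # "Metric X across all hosts": every partition must be consulted.
        total, touched = 0.0, 0
        for partition in self.partitions:
            touched += 1
            for (m, _host), points in partition.series.items():
                if m == metric:
                    total += sum(v for t, v in points if start <= t <= end)
        return total, touched
```

The per-host query touches one partition, while the cross-host aggregate reports that it touched all 64. That second number is the cost to weigh against Candidate A's write hotspot.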
Workload 2: Multi-Tenant SaaS
50,000 tenants. Top 3 tenants each generate 100x the median tenant's write volume. Most queries are tenant-scoped. The application needs to honor "delete all of tenant X's data" as a GDPR primitive.
- Pick a partition key and scheme.
- Design for the three giants -- do they live on dedicated shards?
- What does "delete everything for tenant X" cost with your scheme?
- How do you rebalance when a new giant emerges?
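One possible shape for the routing layer, sketched under assumptions: tenants hash onto a shared pool by default, and an override map pins the giants to dedicated shards. The shard names, the tenant IDs, and the `promote` helper are hypothetical, and the data copy that must precede a promotion is deliberately left out.

```python
import hashlib

# Hypothetical shard names; a real deployment would load these from config.
SHARED_SHARDS = ["shared-0", "shared-1", "shared-2", "shared-3"]
DEDICATED = {"tenant-giant-a": "dedicated-a"}  # override map for the giants


def shard_for(tenant_id: str) -> str:
    """Route a tenant to its dedicated shard if it has one, else to the shared pool."""
    override = DEDICATED.get(tenant_id)
    if override is not None:
        return override
    digest = hashlib.sha256(tenant_id.encode()).digest()
    return SHARED_SHARDS[int.from_bytes(digest[:8], "big") % len(SHARED_SHARDS)]


def promote(tenant_id: str, new_shard: str) -> str:
    """Pin a newly emerged giant to its own shard and return the shard it left.

    Assumes the tenant's rows were already copied to new_shard; this sketch
    only flips the routing entry.
    """
    old_shard = shard_for(tenant_id)
    DEDICATED[tenant_id] = new_shard
    return old_shard
```

Because every tenant resolves to exactly one shard, "delete everything for tenant X" is a single-shard operation in this layout. The cost moves into the override map, which becomes state that every router must agree on.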
Workload 3: Social Feed
A Twitter-like service with 500M users, power-law distribution of follower counts. A celebrity with 50M followers posts once per hour. The read pattern is "show me the latest 50 posts from everyone I follow."
- Which is worse: write fan-out (materializing every follower's timeline) or read fan-out (querying posts from every followee at read time)?
- How do you partition posts? By user_id? By post_id? By time?
- Where is the hotspot for a celebrity post, and how do you absorb it?
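The fan-out question can be sized with a few lines of arithmetic. The sketch below also shows one commonly discussed hybrid: push posts to follower timelines for ordinary accounts, and merge celebrity posts in at read time. The threshold of 100,000 followers and all function names are assumptions for illustration, and the per-second figure averages the fan-out over the whole hour, which understates the real burst.

```python
import heapq
from itertools import chain

FANOUT_THRESHOLD = 100_000  # hypothetical cutoff between "push" and "merge at read"


def timeline_writes_per_sec(followers: int, posts_per_hour: float) -> float:
    """Average timeline inserts per second if every post is pushed to every follower."""
    return followers * posts_per_hour / 3600


def delivery_strategy(followers: int) -> str:
    """Pick a delivery mode per author based on follower count."""
    return "merge_at_read" if followers >= FANOUT_THRESHOLD else "push_on_write"


def read_feed(materialized, celebrity_posts, limit=50):
    """Combine a precomputed timeline with posts fetched from followed celebrities.

    Posts are (timestamp, post_id) tuples; celebrity_posts is a list of such lists.
    Returns the newest `limit` posts, newest first.
    """
    return heapq.nlargest(limit, chain(materialized, *celebrity_posts))


# One post per hour to 50M followers, pushed on write:
print(round(timeline_writes_per_sec(50_000_000, 1)))
```

Whatever threshold you choose, the memo should state where the celebrity's own posts live, since that partition now absorbs the read traffic that write fan-out would have spread out.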
Workload 4: B2B Catalog Search
A product catalog for 1,000 retail partners. Each partner has 10k-1M products. Users search across all partners. Dominant query: WHERE name ILIKE '%query%' AND category = ? with sort by price.
- What partition key serves the common query?
- What secondary indexes do you need?
- Global or local indexes?
- Elasticsearch and MongoDB both partition secondary indexes by document (local indexes). How would a term-partitioned (global) index differ operationally here?
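The local-versus-global question is easier to argue with a toy model in hand. The sketch below covers only the category filter and the price sort (substring matching on name is out of scope), and it assumes products are partitioned by partner_id. The class names, the partition count of 4, and the CRC32 routing are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical


def part(value: str) -> int:
    return zlib.crc32(value.encode()) % NUM_PARTITIONS


class LocalIndexStore:
    """Each partition indexes only the products it stores (document-partitioned)."""

    def __init__(self) -> None:
        self.index = [defaultdict(list) for _ in range(NUM_PARTITIONS)]

    def put(self, product) -> set:
        p = part(product["partner_id"])
        self.index[p][product["category"]].append(product)
        return {p}  # partitions written

    def search(self, category):
        hits = []
        for idx in self.index:  # scatter to every partition, then gather
            hits.extend(idx.get(category, []))
        return sorted(hits, key=lambda d: d["price"]), NUM_PARTITIONS


class GlobalIndexStore:
    """The index is partitioned by the indexed term, not by where the product lives."""

    def __init__(self) -> None:
        self.docs = [dict() for _ in range(NUM_PARTITIONS)]
        self.index = [defaultdict(list) for _ in range(NUM_PARTITIONS)]

    def put(self, product) -> set:
        doc_p = part(product["partner_id"])
        idx_p = part(product["category"])
        self.docs[doc_p][product["id"]] = product
        self.index[idx_p][product["category"]].append(product)
        return {doc_p, idx_p}  # a write may touch two partitions

    def search(self, category):
        hits = self.index[part(category)].get(category, [])
        return sorted(hits, key=lambda d: d["price"]), 1
```

Both stores return the same sorted results; they differ in the second value returned by `search` (partitions read) and in the set returned by `put` (partitions written). Those two numbers are the read and write costs your memo should compare.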
Workload 5: Financial Ticker
8,000 symbols, quotes arriving at 1k updates/sec per symbol during market hours. Queries are "get the last 1 minute for symbol X" (very frequent) and "scan all symbols for price > threshold" (less frequent but important).
- Partition key: symbol? timestamp? composite?
- Can you answer the "scan all symbols" query without fan-out?
- What happens at market open (every symbol goes hot simultaneously)?
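Before picking a key, it helps to size the load. The sketch below does the arithmetic from the numbers above and shows one possible way to serve the "scan all symbols" query from a single place: a small latest-quote view with one entry per symbol. The view, its method names, and the even-spread assumption behind `per_partition_rate` are illustrative, not part of the workload definition.

```python
SYMBOLS = 8_000
UPDATES_PER_SYMBOL_PER_SEC = 1_000


def aggregate_rate() -> int:
    """Total quote updates per second during market hours."""
    return SYMBOLS * UPDATES_PER_SYMBOL_PER_SEC


def per_partition_rate(num_partitions: int) -> float:
    """Updates per second per partition, assuming symbols spread evenly."""
    return aggregate_rate() / num_partitions


class LatestQuoteView:
    """Holds only the most recent (timestamp, price) per symbol: at most 8,000 entries."""

    def __init__(self) -> None:
        self.latest = {}

    def apply(self, symbol: str, ts: int, price: float) -> None:
        current = self.latest.get(symbol)
        if current is None or ts >= current[0]:  # ignore stale, out-of-order quotes
            self.latest[symbol] = (ts, price)

    def above(self, threshold: float) -> list:
        """Answer 'price > threshold' from the view, without scanning the history."""
        return sorted(s for s, (_, price) in self.latest.items() if price > threshold)
```

The view avoids read fan-out, but it does not make the write load disappear: something still has to feed it every update, so the memo should say where that stream is merged and what happens to it at market open.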
Rebalancing Scenarios
For each, walk through what happens:
- A fixed-partition Couchbase-style cluster goes from 4 nodes to 5. Partition count stays 1024.
- A dynamic-partitioning HBase cluster's newest region hits 10 GB and splits.
- A Cassandra cluster using vnodes loses one node cleanly (heartbeat timeout).
- A MongoDB cluster adds one shard; the balancer starts migrating chunks.
- A hash-partitioned system changes its partition count from 128 to 256. What percent of keys move under naive hash mod N? Under consistent hashing?
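"Describe exactly what moves" is easier when you can count it. The following sketch measures data movement under three simplified models: naive hash mod N, a fixed set of 1024 partitions reassigned when a fifth node joins, and a consistent-hash ring with virtual nodes. The greedy stealing policy, the 100 virtual nodes per node, and all names are assumptions; real systems choose donors and tokens differently.

```python
import bisect
import hashlib
from collections import defaultdict


def h(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def moved_fraction_mod_n(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose partition changes when N changes under hash mod N."""
    return sum(1 for k in keys if h(k) % old_n != h(k) % new_n) / len(keys)


def add_node_fixed_partitions(assignment: dict, new_node: str) -> int:
    """Give new_node its fair share by stealing whole partitions from the fullest nodes.

    Keys never change partition; only partition-to-node ownership changes.
    Returns the number of partitions moved.
    """
    by_node = defaultdict(list)
    for partition, node in assignment.items():
        by_node[node].append(partition)
    target = len(assignment) // (len(by_node) + 1)
    for _ in range(target):
        donor = max(by_node, key=lambda n: len(by_node[n]))
        assignment[by_node[donor].pop()] = new_node
    return target


class Ring:
    """Consistent-hash ring; each node owns `vnodes` points on the ring."""

    def __init__(self, nodes, vnodes: int = 100) -> None:
        self.points = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self.hashes = [point for point, _ in self.points]

    def owner(self, key: str) -> str:
        # First point at or after the key's hash, wrapping around the ring.
        i = bisect.bisect_left(self.hashes, h(key)) % len(self.points)
        return self.points[i][1]


keys = [f"key-{i}" for i in range(20_000)]

assignment = {p: f"node-{p % 4}" for p in range(1024)}
moved_partitions = add_node_fixed_partitions(assignment, "node-4")

old_ring = Ring([f"node-{i}" for i in range(4)])
new_ring = Ring([f"node-{i}" for i in range(5)])
ring_moves = [k for k in keys if old_ring.owner(k) != new_ring.owner(k)]

print("mod N, 4 -> 5:        ", moved_fraction_mod_n(keys, 4, 5))
print("mod N, 128 -> 256:    ", moved_fraction_mod_n(keys, 128, 256))
print("fixed partitions moved:", moved_partitions, "of", len(assignment))
print("ring, 4 -> 5 nodes:   ", len(ring_moves) / len(keys))
```

Two things are worth checking against your own walk-through. Under hash mod N, a key stays put only when both remainders agree, which is why 4 to 5 reshuffles most keys while an exact doubling moves about half. On the ring, every key that moves lands on the new node, and nothing is exchanged between the existing nodes.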
Common Mistake Check
- "Hash on user_id avoids all hotspots."
- "Range on timestamp is fine; we have lots of partitions."
- "We can always add a shard later if load grows."
- "Global secondary indexes are fast, so use them everywhere."
- "A fixed partition count of 32 will be enough forever."
Evidence Check
This workshop is complete only if:
- You have written design memos (using the template) for at least three of the five workloads.
- You have diagnosed at least one hotspot in each memo and proposed a fix.
- You have walked through at least two rebalancing scenarios and described exactly what moves.
- You can explain, in one paragraph, why a single hot key cannot be fixed by any partitioning scheme and what the application must do instead.
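For the last item, one application-level technique worth having in mind is key salting: the application splits a known hot key into several sub-keys on write and recombines them on read. The sketch below is a minimal illustration under assumptions. The bucket count of 16, the example key, and the way hot keys are identified (a hard-coded set) are all hypothetical.

```python
import random

SALT_BUCKETS = 16               # hypothetical number of sub-keys per hot key
HOT_KEYS = {"celebrity:42"}     # hypothetical; in practice this set must be maintained


def write_key(key: str, rng=random) -> str:
    """Spread writes for a hot key over SALT_BUCKETS sub-keys; leave other keys alone."""
    if key in HOT_KEYS:
        return f"{key}#{rng.randrange(SALT_BUCKETS)}"
    return key


def read_keys(key: str) -> list:
    """A read of a hot key must fetch every sub-key and merge the results."""
    if key in HOT_KEYS:
        return [f"{key}#{i}" for i in range(SALT_BUCKETS)]
    return [key]
```

Salting trades a write hotspot for read fan-out and extra bookkeeping, so it only pays off for the few keys that are actually hot. The partitioner cannot do this on its own because, by definition, it sends every request for one key to one partition.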