Scaling Design Workshop
Retrieval Prompts
- State the defining difference between vertical and horizontal scaling and give one failure-mode implication of each.
- Define stateless service and name the two most common ways a service that looks stateless is actually stateful.
- State the three most common session-affinity strategies and one drawback of each.
- Name the four cache update patterns and the dominant failure mode of each.
- Name at least three caching layers that sit between the user's keystroke and the database row.
Compare and Distinguish
- scale up vs scale out
- stateless vs sticky vs external session store
- cache-aside vs write-through vs write-behind
- CDN cache vs reverse-proxy cache vs application cache
- cache invalidation vs TTL expiry
- thundering herd vs cache stampede (same thing, or different?)
Common Mistake Check
For each, identify the error:
- "We made the service horizontally scalable by adding a load balancer."
- "Sticky sessions are fine as long as the load balancer is smart."
- "Write-behind is safe because we eventually write to the DB."
- "We have a CDN, so we don't need any other caching."
- "The cache was slow so we doubled its memory."
Stateful-Component Audit
Pick an application you know. Walk the request path from client -> CDN -> API -> service -> DB. For every component, answer:
- Does it hold per-user state between requests? What kind?
- If the current instance disappears, what does the user notice?
- What would it take to make this hop truly stateless (or confirm it already is)?
Cache Design Exercise: Product Catalog
You run a product-catalog API. Key facts:
- 5 million products.
- 99% of traffic hits the top 50,000 products.
- Products change rarely (a few hundred updates/day).
- Price is included in the product payload and updates must be reflected within 60 seconds.
- P99 latency without cache: 250 ms. SLO target: 50 ms.
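Before choosing layers, it can help to turn the key facts into rough bounds. The sketch below is back-of-envelope arithmetic only. It assumes the hot set is exactly the top 50,000 products, that a 60-second TTL is the only expiry mechanism, and that each cache instance reloads an expired entry at most once per TTL window. The variable names are illustrative.

```python
TOTAL_PRODUCTS = 5_000_000
HOT_PRODUCTS = 50_000
HOT_TRAFFIC_SHARE = 0.99      # share of requests that hit the hot set
MAX_STALENESS_S = 60          # price changes must be visible within this window

# Fraction of the catalog that must be cached to cover the hot traffic.
hot_fraction = HOT_PRODUCTS / TOTAL_PRODUCTS

# Upper bound on the hit ratio if only the hot set is cached and always warm.
best_case_hit_ratio = HOT_TRAFFIC_SHARE

# Upper bound on origin reloads per second, per independent cache instance,
# if every hot entry expires and is re-requested once per TTL window.
worst_case_refills_per_s = HOT_PRODUCTS / MAX_STALENESS_S

print(f"hot set is {hot_fraction:.0%} of the catalog")
print(f"best-case hit ratio: {best_case_hit_ratio:.0%}")
print(f"worst-case refills: {worst_case_refills_per_s:.0f}/s per cache instance")
```

Note that with a hit ratio near 99%, roughly one request in a hundred still pays the uncached latency, which sits right at the P99 boundary. That is worth keeping in mind when checking the design against the SLO.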
Design:
- What layers of cache do you use? (CDN? Reverse proxy? App-level? DB-level?)
- Which update pattern (cache-aside, write-through, write-behind, refresh-ahead) for each layer, and why?
- Where and how do you invalidate on a price change?
- How do you avoid a thundering herd when a hot product expires?
- What do you do on cache-server failure? (Fail open? Fail closed? Fallback?)
- Which SLIs and SLOs would you define and instrument to validate this design?
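One possible starting point for the app-level layer, offered as a minimal sketch rather than a reference answer: cache-aside with a jittered TTL, per-key single-flight on a miss, and explicit invalidation. The class name `CacheAside`, the `loader` callback, and the in-process dictionary are all assumptions made for illustration. A real deployment would more likely use a shared cache, and the per-key lock table here grows without bound.

```python
import random
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """In-process cache-aside with jittered TTL and per-key single-flight."""

    def __init__(self, loader: Callable[[str], object],
                 ttl_s: float = 60.0, jitter: float = 0.1) -> None:
        self._loader = loader            # hypothetical: fetches the row from the DB
        self._ttl_s = ttl_s
        self._jitter = jitter
        self._data: Dict[str, Tuple[float, object]] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()   # protects the lock table only

    def _fresh(self, key: str) -> Optional[Tuple[float, object]]:
        entry = self._data.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key: str) -> object:
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have refilled while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            value = self._loader(key)
            # Jitter only shortens the TTL, so the staleness bound still holds
            # while hot keys stop expiring in lockstep.
            ttl = self._ttl_s * (1 - self._jitter * random.random())
            self._data[key] = (time.monotonic() + ttl, value)
            return value

    def invalidate(self, key: str) -> None:
        """Call from the write path, for example after a price change."""
        self._data.pop(key, None)
```

With this shape, concurrent misses on the same hot key trigger one load instead of many, which addresses the thundering-herd question for a single process. It does not coordinate across instances, and if several layers each apply their own TTL, the worst-case staleness is roughly the sum of those TTLs, so the 60-second budget has to be split between them or enforced by invalidation.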
Session-Management Redesign
A legacy monolith stores sessions in the web server's local memory. The team wants to scale horizontally and deploy with zero downtime.
- List everything that breaks about the current design.
- Propose three alternative designs: (a) pure stateless JWT, (b) external session store, (c) sticky sessions with failover. For each, give one pro, one con, and one non-obvious failure mode.
- Pick one and write the migration plan in 5 bullet points.
Horizontal Capacity Plan
A stateless API server sustains 200 req/s at 50 ms mean latency on a single instance. Each instance has 2 vCPUs and 4 GB of RAM. You expect a peak load of 5,000 req/s with a 3x peak-to-mean ratio, and you want 30% headroom.
- How many instances at peak?
- How many instances at steady-state mean?
- What do you set the HPA (Horizontal Pod Autoscaler) min and max replicas to?
- Apply Little's Law: what is the average concurrency per instance, and how does it compare to the thread-pool size you should configure?
- What signal would tell you the capacity model is wrong?
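The arithmetic behind the first, second, and fourth questions can be sketched as below. This is one reading of the inputs, not the only defensible one: it assumes "30% headroom" means provisioning 1.3x the required capacity, and that per-instance throughput stays at 200 req/s as instances are added. The helper name `instances_needed` is illustrative.

```python
import math


def instances_needed(load_rps: float, per_instance_rps: float, headroom: float) -> int:
    """Instances required to serve load_rps with spare capacity.

    Assumption: headroom=0.30 means provisioning 1.3x the required capacity.
    The other common reading (run each instance at 70% utilization) divides
    by (1 - headroom) instead and gives a larger count.
    """
    return math.ceil(load_rps * (1 + headroom) / per_instance_rps)


PER_INSTANCE_RPS = 200     # measured single-instance throughput
MEAN_LATENCY_S = 0.050     # 50 ms mean latency
PEAK_RPS = 5_000
PEAK_TO_MEAN = 3
HEADROOM = 0.30

mean_rps = PEAK_RPS / PEAK_TO_MEAN
peak_instances = instances_needed(PEAK_RPS, PER_INSTANCE_RPS, HEADROOM)
mean_instances = instances_needed(mean_rps, PER_INSTANCE_RPS, HEADROOM)

# Little's Law: L = lambda * W (mean in-flight requests = arrival rate * mean latency).
concurrency_per_instance = PER_INSTANCE_RPS * MEAN_LATENCY_S

print(f"peak: {peak_instances} instances, mean: {mean_instances} instances")
print(f"mean in-flight requests per instance: {concurrency_per_instance:.0f}")
```

Under these assumptions the sketch gives 33 instances at peak, 11 at the mean, and about 10 in-flight requests per instance at 200 req/s. The concurrency figure is an average, so a thread pool sized exactly to it leaves no room for bursts or slow requests. How much margin to add is part of the exercise.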
Evidence Check
This practice page is complete only if you can:
- Justify, in 60 seconds, why a team should pick stateless-with-external-store over sticky sessions for a new service.
- Sketch a three-layer cache with update patterns and invalidation paths for a real system.
- Translate a request rate and latency into an instance count with headroom.
- Spot a hidden stateful dependency in what looks like a stateless tier.