
Scaling Design Workshop

Retrieval Prompts

  1. State the defining difference between vertical and horizontal scaling and give one failure-mode implication of each.
  2. Define a stateless service and name the two most common ways a service that looks stateless is actually stateful.
  3. State the three most common session-affinity strategies and one drawback of each.
  4. Name the four cache update patterns and the dominant failure mode of each.
  5. State at least three caching layers between the user's keystroke and the database row.

Compare and Distinguish

  • scale up vs scale out
  • stateless vs sticky vs external session store
  • cache-aside vs write-through vs write-behind
  • CDN cache vs reverse-proxy cache vs application cache
  • cache invalidation vs TTL expiry
  • thundering herd vs cache stampede (same thing, or different?)

Common Mistake Check

For each, identify the error:

  1. "We made the service horizontally scalable by adding a load balancer."
  2. "Sticky sessions are fine as long as the load balancer is smart."
  3. "Write-behind is safe because we eventually write to the DB."
  4. "We have a CDN, so we don't need any other caching."
  5. "The cache was slow so we doubled its memory."

Stateful-Component Audit

Pick an application you know. Walk the request path from client -> CDN -> API -> service -> DB. For every component, answer:

  1. Does it hold per-user state between requests? What kind?
  2. If the current instance disappears, what does the user notice?
  3. What would it take to make this hop truly stateless (or confirm it already is)?

Cache Design Exercise: Product Catalog

You run a product-catalog API. Key facts:

  • 5 million products.
  • 99% of traffic hits the top 50,000 products.
  • Products change rarely (a few hundred updates/day).
  • Price is included in the product payload, and price updates must be reflected within 60 seconds.
  • P99 latency without cache: 250 ms. SLO target: 50 ms.

Design:

  1. What layers of cache do you use? (CDN? Reverse proxy? App-level? DB-level?)
  2. Which update pattern (cache-aside, write-through, write-behind, refresh-ahead) for each layer, and why?
  3. Where and how do you invalidate on a price change?
  4. How do you avoid a thundering herd when a hot product expires?
  5. What do you do on cache-server failure? (Fail open? Fail closed? Fallback?)
  6. Which SLIs and SLOs would you instrument to validate this design?
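
As a starting point for questions 2 to 4, here is a minimal sketch of one possible app-level layer: cache-aside reads, a jittered TTL that stays under the 60-second freshness bound, explicit invalidation on a price change, and a per-key lock so that only one caller reloads an expired hot product. The class name `SingleFlightCache`, the `loader` callback, and the 45 s + 10 s TTL values are assumptions made for illustration, not part of the exercise.

```python
import random
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class SingleFlightCache:
    """Cache-aside with jittered TTLs and one loader call per missing key."""

    def __init__(self, loader: Callable[[str], object],
                 ttl_s: float = 45.0, jitter_s: float = 10.0) -> None:
        self._loader = loader      # hypothetical database read
        self._ttl_s = ttl_s        # base TTL; base + jitter stays below 60 s
        self._jitter_s = jitter_s  # spreads expiries so hot keys do not expire together
        self._entries: Dict[str, Tuple[float, object]] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def _fresh(self, key: str) -> Optional[Tuple[float, object]]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key: str) -> object:
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]                      # cache hit
        with self._guard:
            # One lock per key (never evicted in this sketch).
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            entry = self._fresh(key)             # another thread may have reloaded it
            if entry is not None:
                return entry[1]
            value = self._loader(key)            # only one thread per key reaches this
            expires = (time.monotonic() + self._ttl_s
                       + random.uniform(0, self._jitter_s))
            self._entries[key] = (expires, value)
            return value

    def invalidate(self, key: str) -> None:
        """Call on a price change so the next read goes back to the source."""
        self._entries.pop(key, None)
```

The sketch covers a single process only. With several app instances, each holds its own copy, so an invalidation has to reach every instance (or the TTL alone has to satisfy the 60-second bound), which is part of what question 3 asks you to decide.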

Session-Management Redesign

A legacy monolith stores sessions in the web server's local memory. The team wants to scale horizontally and deploy with zero downtime.

  1. List everything that breaks about the current design.
  2. Propose three alternative designs: (a) pure stateless JWT, (b) external session store, (c) sticky sessions with failover. For each, give one pro, one con, and one non-obvious failure mode.
  3. Pick one and write the migration plan in 5 bullet points.
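
To make option (a) concrete, the following is a rough sketch of a stateless signed token built only from the Python standard library. It is JWT-like, not a real JWT: the token layout, the `SECRET` constant, and the `issue_token` / `verify_token` names are all assumptions made for illustration. A production system would more likely use a maintained JWT library and keys from a secret manager.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-only-secret"  # hypothetical; never hard-code real keys


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(user_id: str, ttl_s: int, now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>'; no server-side state is stored."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued + ttl_s}).encode("utf-8")
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(sig)


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the signature is valid and the token is unexpired."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:          # wrong shape or invalid base64
        return None
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):   # constant-time comparison
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims["sub"] if claims["exp"] > current else None
```

Any instance that holds the key can verify a token without a lookup, which is what makes the tier stateless. Note what the sketch cannot do: nothing here can revoke a token before `exp`, so logout or a forced sign-out needs either short lifetimes or a server-side deny list, which brings state back.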

Horizontal Capacity Plan

A stateless API server serves 200 req/s at 50 ms mean latency on a single instance. Each instance has 2 vCPU and 4 GB RAM. You expect a peak load of 5,000 req/s with a 3x peak-to-mean ratio, and you want 30% headroom.

  1. How many instances at peak?
  2. How many instances at steady-state mean?
  3. What do you set the Horizontal Pod Autoscaler (HPA) min/max to?
  4. Apply Little's Law: what is the average concurrency per instance, and how does it compare to the thread-pool size you should configure?
  5. What signal would tell you the capacity model is wrong?
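
One way to check your arithmetic for questions 1, 2, and 4 is sketched below. It rests on two assumptions that the exercise leaves open: that 200 req/s is the sustainable capacity of one instance, and that "30% headroom" means provisioning for 1.3x the expected load. If you instead read headroom as running each instance at 70% utilization, the counts come out higher. The function names are made up for this sketch.

```python
import math


def instances_needed(load_rps: float, per_instance_rps: float,
                     headroom: float) -> int:
    """Instances required so that load * (1 + headroom) fits measured capacity."""
    return math.ceil(load_rps * (1 + headroom) / per_instance_rps)


def littles_law_concurrency(throughput_rps: float, mean_latency_s: float) -> float:
    """Little's Law: L = lambda * W (mean requests in flight)."""
    return throughput_rps * mean_latency_s


PEAK_RPS = 5000
PEAK_TO_MEAN = 3
PER_INSTANCE_RPS = 200
MEAN_LATENCY_S = 0.050
HEADROOM = 0.30

peak_instances = instances_needed(PEAK_RPS, PER_INSTANCE_RPS, HEADROOM)
mean_instances = instances_needed(PEAK_RPS / PEAK_TO_MEAN, PER_INSTANCE_RPS, HEADROOM)
concurrency = littles_law_concurrency(PER_INSTANCE_RPS, MEAN_LATENCY_S)

print(f"instances at peak: {peak_instances}")
print(f"instances at mean: {mean_instances}")
print(f"mean in-flight requests per instance: {concurrency:.1f}")
```

The Little's Law figure is a mean. A thread pool sized exactly to it has no slack for latency spikes, so treat it as a floor when you answer question 4 rather than as the setting itself.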

Evidence Check

This practice page is complete only if you can:

  • Justify, in 60 seconds, why a team should pick stateless-with-external-store over sticky sessions for a new service.
  • Sketch a three-layer cache with update patterns and invalidation paths for a real system.
  • Translate a request rate and latency into an instance count with headroom.
  • Spot a hidden stateful dependency in what looks like a stateless tier.