Estimation and Framing Lab

Retrieval Prompts

  1. Name the three separate lists the framing step produces, in order. Why are they three and not one?
  2. State the formula for QPS from DAU and actions-per-user.
  3. From memory: the values of 2^10, 2^20, 2^30, and 2^40, and the Jeff Dean latency numbers for a main-memory reference, an SSD random read, an intra-datacenter round trip, and a cross-continent round trip.
  4. What is the 80/20 heuristic for cache sizing, in one sentence?
  5. Define "hard part" in the sense used in Cluster 1 concept 3.
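Once you have attempted prompts 2 and 3 from memory, a short script can confirm the arithmetic. The following is a minimal sketch, not a reference answer key: the function name average_qps and the example inputs (1 M DAU, 10 actions per user per day) are illustrative assumptions, and the formula assumes traffic spread evenly over a day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(dau: int, actions_per_user_per_day: float) -> float:
    """Average QPS = DAU x actions per user per day / seconds per day."""
    return dau * actions_per_user_per_day / SECONDS_PER_DAY


# Powers of two paired with the decimal approximations used in estimation.
POWERS_OF_TWO = {
    10: "about 1 thousand (KB)",
    20: "about 1 million (MB)",
    30: "about 1 billion (GB)",
    40: "about 1 trillion (TB)",
}

if __name__ == "__main__":
    for exponent, label in POWERS_OF_TWO.items():
        print(f"2^{exponent} = {2 ** exponent:,} -> {label}")
    # Illustrative inputs only: 1 M DAU, 10 actions per user per day.
    print(f"{average_qps(1_000_000, 10):.0f} QPS average")
```

A handy shortcut when working without a calculator is to round 86,400 up to 10^5, which understates QPS by about 15% and is usually acceptable for order-of-magnitude work.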

Compare and Distinguish

Separate these cleanly:

  • functional requirement vs non-functional requirement vs constraint
  • latency vs throughput
  • read:write ratio vs cache hit ratio
  • bottleneck vs single point of failure
  • skew (in data distribution) vs hot key

For each item, produce a one-sentence distinction you could state out loud at a whiteboard.

Common Mistake Check

For each statement, identify the error:

  1. "The system must support 1 billion users." (Missing what?)
  2. "P99 latency under 200 ms." (Acceptable as stated, or incomplete?)
  3. "We need a cache." (What makes this wrong as framing?)
  4. "Read:write is 10:1, so we need three read replicas." (What hidden assumption is this making?)
  5. "At 100× scale, we will shard." (Shard on what, why, and at what partition count?)

Mini Application

Pick two prompts you have not used yet. For each, produce in 15 minutes:

  1. Three lists (functional, non-functional with numbers, constraints).
  2. Five estimation lines: QPS avg, QPS peak, storage/day, cache size, latency-budget breakdown.
  3. A ranked list of 2-3 hard parts.
  4. The single number that would most change the design if it were 10× larger.

Candidate prompts (or invent your own):

  1. Global distributed rate limiter for 100 K tenants.
  2. Photo-sharing app with 500 M DAU, each uploading 3 photos/day.
  3. Real-time sports score feed for an app with 50 M DAU.
  4. Centralized API usage metering service for a multi-tenant cloud platform.
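The five estimation lines from step 2 can be sketched as one small function. This is a rough template under stated assumptions, not a prescribed method: the Workload fields, the 2× peak factor, and the choice to size the cache at 20% of one day's newly written data (one reading of the 80/20 heuristic) are all assumptions you should replace with your own. In the usage example, only the DAU and the 3 photos/day come from candidate prompt 2; the read rate, photo size, and latency figures are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    dau: int                      # daily active users
    reads_per_user_per_day: int
    writes_per_user_per_day: int
    bytes_per_write: int
    peak_factor: float = 2.0      # assumed peak-to-average ratio
    hot_fraction: float = 0.2     # assumed 80/20 heuristic: cache the hottest 20%


def estimation_lines(w: Workload, target_ms: float, spent_ms: dict) -> dict:
    """Return the five estimation lines as plain numbers."""
    requests_per_day = w.dau * (w.reads_per_user_per_day + w.writes_per_user_per_day)
    qps_avg = requests_per_day / SECONDS_PER_DAY
    storage_per_day = w.dau * w.writes_per_user_per_day * w.bytes_per_write
    return {
        "qps_avg": qps_avg,
        "qps_peak": qps_avg * w.peak_factor,
        "storage_bytes_per_day": storage_per_day,
        "cache_bytes": storage_per_day * w.hot_fraction,
        # Budget left for compute after the listed components are subtracted.
        "latency_ms_left": target_ms - sum(spent_ms.values()),
    }


if __name__ == "__main__":
    # Hypothetical inputs: only dau and writes_per_user_per_day come from the prompt.
    photos = Workload(dau=500_000_000, reads_per_user_per_day=30,
                      writes_per_user_per_day=3, bytes_per_write=2_000_000)
    lines = estimation_lines(photos, target_ms=200,
                             spent_ms={"network": 60, "storage read": 20})
    for name, value in lines.items():
        print(f"{name}: {value:.3g}")
```

Changing one field by 10× and rerunning is a quick way to answer step 4, the single number that would most change the design.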

Numbers Drill (No Calculator)

Do these in under 10 minutes. Write order of magnitude only.

  1. 300 M DAU × 20 actions/user/day -> average QPS?
  2. 1 B posts/day × 2 KB each -> storage/day?
  3. 10 K QPS, each with a 512 B request and an 8 KB response -> bandwidth in MB/s in each direction?
  4. 2 TB hot set × 3× replication -> total cluster memory needed?
  5. Latency budget of 150 ms with 30 ms network each way: how much budget remains for compute?
  6. 99.95% SLO -> minutes of allowed downtime per 30-day month?
  7. If your cache hit rate drops from 95% to 90%, what is the relative increase in origin load?
  8. Typical cross-continent round trip (Jeff Dean table) -> how many serial cross-continent round trips fit in a 400 ms P99 budget?
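After finishing the drill by hand, you can compare your orders of magnitude against the exact arithmetic. The sketch below makes a few assumptions: units are decimal (1 KB = 1,000 B), item 7 assumes origin load is proportional to the cache miss rate, and item 8 uses the commonly cited figure of roughly 150 ms for a cross-continent round trip. The function name drill_answers is illustrative.

```python
SECONDS_PER_DAY = 86_400
KB, MB, TB = 10**3, 10**6, 10**12   # decimal units (assumption)
CROSS_CONTINENT_RTT_MS = 150         # commonly cited figure (assumption)


def drill_answers() -> dict:
    """Exact values for drill items 1-8, keyed by item number."""
    return {
        1: 300_000_000 * 20 / SECONDS_PER_DAY,          # average QPS
        2: 1_000_000_000 * 2 * KB / TB,                 # TB written per day
        3: (10_000 * 512 / MB, 10_000 * 8 * KB / MB),   # MB/s (requests, responses)
        4: 2 * 3,                                       # TB of cluster memory
        5: 150 - 2 * 30,                                # ms left for compute
        6: 30 * 24 * 60 * 5 / 10_000,                   # minutes; 0.05% is 5 in 10,000
        7: (100 - 90) / (100 - 95),                     # new miss rate / old miss rate
        8: 400 // CROSS_CONTINENT_RTT_MS,               # whole round trips in budget
    }


if __name__ == "__main__":
    for item, value in drill_answers().items():
        print(item, value)
```

Item 7 is the one most often missed: a five-point drop in hit rate doubles the miss rate, so the origin sees twice the load.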

Evidence Check

This page is complete only if:

  • you wrote down five estimation lines for each of your two chosen prompts, without a calculator
  • you named 2-3 hard parts per prompt with one-sentence reasons
  • you can recite the powers-of-two row (10, 20, 30, 40) from memory
  • you identified at least one common-mistake statement in your own first-pass framing and corrected it