Module Quiz
Complete this quiz after finishing all concept and practice pages.
Current Module Questions
Question 1: Requirements Discipline
You are asked to "design a notifications service." What are the first three things you must explicitly ask before any design?
Answer: (1) Who sends, who receives, and through what channels (functional scope); (2) scale and latency targets: DAU, notifications/user/day, P99 delivery latency, durability (non-functional with numbers); (3) constraints: mobile push vs email vs SMS, regulatory requirements, existing infrastructure.
Question 2: Estimation Drill -- Feed Writes
A service has 100 M DAU; each user writes 5 items per day; each item is 500 bytes. Estimate (a) writes per second average, (b) raw storage per day, (c) raw storage per year.
Answer:
(a) 100 M × 5 / 86,400 ≈ 5,800 writes/sec average (~5.8 K/s; peak at 3× ≈ 17 K/s).
(b) 100 M × 5 × 500 B = 250 GB/day.
(c) 250 GB × 365 ≈ 91 TB/year raw (before replication/indexes); with 3× replication and ~30% index overhead, call it ~360 TB/year.
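The arithmetic above can be re-run as a short script. This is a minimal sketch: the function name, the parameter names, and the choice of decimal units (1 GB = 1e9 bytes) are assumptions made for illustration, while the 3× peak factor, 3× replication, and ~30% index overhead are the figures quoted in the answer.

```python
SECONDS_PER_DAY = 86_400


def estimate_feed_writes(dau, writes_per_user_per_day, item_bytes,
                         peak_factor=3, replication=3, index_overhead=0.30):
    """Back-of-envelope estimate in decimal units (1 GB = 1e9 B, 1 TB = 1e12 B)."""
    writes_per_day = dau * writes_per_user_per_day
    avg_wps = writes_per_day / SECONDS_PER_DAY
    raw_bytes_per_day = writes_per_day * item_bytes
    raw_bytes_per_year = raw_bytes_per_day * 365
    # Replication multiplies the raw bytes; index overhead is added on top.
    provisioned = raw_bytes_per_year * replication * (1 + index_overhead)
    return {
        "avg_writes_per_sec": avg_wps,
        "peak_writes_per_sec": avg_wps * peak_factor,
        "raw_gb_per_day": raw_bytes_per_day / 1e9,
        "raw_tb_per_year": raw_bytes_per_year / 1e12,
        "provisioned_tb_per_year": provisioned / 1e12,
    }


estimate = estimate_feed_writes(dau=100_000_000,
                                writes_per_user_per_day=5,
                                item_bytes=500)
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

The unrounded figures come out near 5,787 writes/sec average, 250 GB/day, 91 TB/year raw, and 356 TB/year provisioned, which is where the rounded "~360 TB/year" comes from.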
Question 3: Estimation Drill -- Cache Sizing
Your read workload is 80/20 skewed. You want 95% cache hit rate on the last 24 hours of content, where 24 hours = 1 TB raw of items. Roughly how much cache memory do you need?
Answer: The 80/20 rule says ~20% of items serve ~80% of requests; to get to 95%, you cache somewhat more. A reasonable heuristic is 25-35% of the hot period, so ~250-350 GB of distributed cache. State the heuristic; the exact number is less important than showing you can derive it.
Question 4: Estimation Drill -- Latency Budget
P99 target 200 ms. Client to edge: ~30 ms. Cross-continent round trip: ~150 ms. How many cross-continent round trips can the request path afford?
Answer: Zero on the hot path. One cross-continent RTT eats 150 ms of the 200 ms budget; after the ~30 ms client-to-edge hop, only about 20 ms remain for everything else. Cross-continent calls must be avoided on the user-visible request path; use async replication, regional caches, or region-local writes with eventual cross-region sync.
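The budget can be checked with a one-line subtraction. A minimal sketch, assuming a hypothetical helper name and treating the quoted figures as fixed costs:

```python
def remaining_budget_ms(p99_target_ms, client_to_edge_ms,
                        cross_continent_rtt_ms, round_trips):
    """Milliseconds left for all other work after the fixed network costs."""
    return p99_target_ms - client_to_edge_ms - round_trips * cross_continent_rtt_ms


# Figures from the question: 200 ms target, 30 ms to edge, 150 ms per RTT.
budget_by_round_trips = {
    n: remaining_budget_ms(200, 30, 150, n) for n in range(3)
}
for n, left in budget_by_round_trips.items():
    print(f"{n} cross-continent round trips -> {left} ms left")
```

With zero round trips 170 ms remain, with one only 20 ms remain, and with two the budget is already negative.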
Question 5: Pick the Tradeoff -- Storage
You are designing a time-series metrics pipeline, 1 M data points per second, 100 B each, with mostly-append writes and aggregate reads over recent windows. What storage would you choose and what would you reject?
Answer: Choose a time-series or columnar store (InfluxDB, ClickHouse, TimescaleDB, or Prometheus for short-retention). Reject OLTP SQL because the write volume (~100 MB/s) will saturate B-tree indexes and reads will sweep enormous row ranges; reject a generic document store because aggregate reads over billions of tiny points need columnar layout to be fast. Cost: less flexibility in ad-hoc schema evolution; mitigated because the schema is narrow.
Question 6: Pick the Tradeoff -- Consistency Model
Two users race to reserve the last seat on a flight. What concurrency mechanism do you pick, what do you reject, and why?
Answer: Pick optimistic concurrency with a version column on the seat row: UPDATE seats SET held_by=?, version=version+1 WHERE seat_id=? AND version=?. Reject last-writer-wins (would double-allocate) and reject pessimistic locking (would block and create a stampede). Cost of OCC: one retry on conflict for the loser; in a 10-seat race among 10K users, most requests fail fast with "taken", which is the correct UX anyway.
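The version-column update can be exercised end to end with an in-memory SQLite database. This is a sketch under stated assumptions: the table layout, the seat id "12A", the user names, and the helper try_reserve are hypothetical; only the UPDATE statement follows the answer.

```python
import sqlite3


def try_reserve(conn, seat_id, user, seen_version):
    """Attempt the reservation; succeeds only if the row version is unchanged."""
    cur = conn.execute(
        "UPDATE seats SET held_by = ?, version = version + 1 "
        "WHERE seat_id = ? AND version = ?",
        (user, seat_id, seen_version),
    )
    conn.commit()
    # rowcount is 1 for the winner and 0 for anyone holding a stale version.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seats ("
    "seat_id TEXT PRIMARY KEY, held_by TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO seats VALUES ('12A', NULL, 0)")
conn.commit()

# Both users read the seat at version 0, then race to write.
alice_won = try_reserve(conn, "12A", "alice", seen_version=0)
bob_won = try_reserve(conn, "12A", "bob", seen_version=0)
holder = conn.execute(
    "SELECT held_by, version FROM seats WHERE seat_id = '12A'"
).fetchone()
print(alice_won, bob_won, holder)
```

The loser sees zero affected rows and can report "taken" immediately, or re-read the row and retry if other seats remain.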
Question 7: Partitioning Key
A social feed stores one row per (user, post) in a user_timeline table. Should you partition by user_id or by author_id? Why?
Answer: Partition by user_id. Reads are always "give me user X's timeline", which must hit one partition. author_id would concentrate load on celebrity authors (hot partitions) and scatter each user's timeline read across many partitions, which is the opposite of what you want.
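The difference in read fan-out can be illustrated with a toy hash partitioner. A sketch only: the partition count, the CRC32-based hash, and the synthetic user and author ids are assumptions chosen for illustration, not part of any particular datastore.

```python
import zlib

NUM_PARTITIONS = 16  # hypothetical cluster size


def partition_of(key: str) -> int:
    """Deterministic hash partitioning on a string key."""
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


# Timeline rows for one reader, "u1", containing posts from 50 authors.
rows = [("u1", f"author{i}") for i in range(50)]

# Partitions a "give me u1's timeline" read must touch under each key choice.
touched_by_user_id = {partition_of(user) for user, _ in rows}
touched_by_author_id = {partition_of(author) for _, author in rows}

print("partition by user_id  :", len(touched_by_user_id), "partition(s)")
print("partition by author_id:", len(touched_by_author_id), "partition(s)")
```

Keyed on user_id the read stays on one partition; keyed on author_id the same read scatters across several.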
Question 8: Cache Placement Critique
A junior engineer proposes putting a cache in front of a billing ledger to "speed up reads". Why is this a correctness risk, and under what condition could it be acceptable?
Answer: The ledger is a source of truth for money; a cache that can return stale balances can cause a user to overdraft or double-spend if downstream logic acts on the cached view. It is acceptable only if (a) the cache is explicitly a read-only summary for display purposes, (b) all money-moving operations read directly from the ledger, and (c) the UX makes clear which numbers are cached vs authoritative.
Question 9: Bottleneck vs SPOF
A centralized ID-generation service runs on a single instance. It serves 5 K QPS at 10% CPU. Is it a bottleneck? Is it a SPOF? What do you do?
Answer: It is not currently a bottleneck (plenty of CPU). It is a SPOF because its death blocks all new writes system-wide. Remediation: either make it highly available (active-passive with fast failover) or redesign as distributed range-allocation so that instance death has bounded blast radius (each service pre-allocates ID ranges and can operate from its range for minutes without the allocator).
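The range-allocation idea can be sketched in a few lines. Everything here is hypothetical and in-process: RangeAllocator stands in for the central service, LocalIdGenerator for the per-service client, and the block size of 1000 is arbitrary. A real allocator would persist its high-water mark and be called over the network.

```python
class RangeAllocator:
    """Stand-in for the central service: hands out disjoint ID blocks."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.available = True
        self._next_start = 0

    def allocate(self):
        if not self.available:
            raise ConnectionError("allocator unreachable")
        start = self._next_start
        self._next_start += self.block_size
        return start, start + self.block_size  # half-open range [start, end)


class LocalIdGenerator:
    """Per-service generator that only calls the allocator when its block runs out."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._next, self._end = allocator.allocate()

    def next_id(self):
        if self._next >= self._end:
            self._next, self._end = self._allocator.allocate()
        value = self._next
        self._next += 1
        return value


allocator = RangeAllocator(block_size=1000)
gen_a = LocalIdGenerator(allocator)  # holds [0, 1000)
gen_b = LocalIdGenerator(allocator)  # holds [1000, 2000)

allocator.available = False  # simulate the allocator dying
ids_a = [gen_a.next_id() for _ in range(1000)]  # still served locally
first_b = gen_b.next_id()

try:
    gen_a.next_id()  # block exhausted, allocator still down
    outage_hit = False
except ConnectionError:
    outage_hit = True
print(ids_a[0], ids_a[-1], first_b, outage_hit)
```

The blast radius of an allocator outage is bounded by how long each service's remaining block lasts, so block size is the knob that trades ID density for outage tolerance.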
Question 10: 10x Scale Reasoning
Your system handles 10 K QPS today with a single Redis cache (50 GB RAM, 80% hit rate). At 10× traffic (100 K QPS, same workload shape), what breaks first and what do you change?
Answer: The cache tier breaks first. At 100 K QPS with the same 80% hit rate, the origin sees 20 K QPS of misses, up from 2 K, which may saturate the DB; and a single Redis node is unlikely to sustain 100 K QPS of non-trivial commands. Remediation: shard Redis across 3-6 nodes with consistent hashing; grow cache capacity to hold a 100-200 GB hot set; add a short-TTL edge or per-instance cache to absorb very-hot keys; re-check DB capacity against the new cache-miss rate.
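The origin-load numbers follow directly from the miss rate. A minimal sketch with hypothetical helper names; the second helper is an added back-of-envelope check that shows what hit rate would be needed if the DB could not take more than today's load.

```python
def origin_qps(total_qps, hit_rate):
    """Requests per second that miss the cache and reach the origin."""
    return round(total_qps * (1 - hit_rate))


def hit_rate_to_hold_origin(total_qps, origin_capacity_qps):
    """Hit rate required so that misses stay within the origin's capacity."""
    return 1 - origin_capacity_qps / total_qps


today = origin_qps(10_000, 0.80)
at_10x = origin_qps(100_000, 0.80)
needed = hit_rate_to_hold_origin(100_000, today)

print(f"origin today: {today} QPS")
print(f"origin at 10x: {at_10x} QPS")
print(f"hit rate needed to keep origin at today's load: {needed:.0%}")
```

Holding the origin at 2 K QPS under 10× traffic would take roughly a 98% hit rate, which is why the answer pairs a larger, sharded cache with a re-check of DB capacity rather than relying on the cache alone.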
Question 11: Failure Walk -- AZ Outage
Your feed service has app servers and a cache in every AZ of one region, plus a primary DB in AZ-1 with a synchronous replica in AZ-2. AZ-1 goes dark. What happens, and what is your RTO/RPO?
Answer: App servers in AZ-1 and their local cache die (load balancer ejects them); traffic shifts to AZ-2/AZ-3 stateless tiers. The DB primary in AZ-1 is lost; AZ-2's synchronous replica is promoted. RPO ≈ 0 (synchronous). RTO ≈ 30-120 seconds (health-check-driven failover and promotion). Warm caches in AZ-2/AZ-3 absorb most of the shifted traffic; the origin sees a miss spike until those caches warm up for the users rerouted from AZ-1.
Question 12: Pick the Tradeoff -- Fan-out
For a chat app with groups of 1-100,000 members, compare "fan-out on write to every recipient's mailbox" vs "fan-out on read by joining on the group". Which do you pick for which group sizes?
Answer: For small groups (say < 1000), fan-out on write: the per-message write amplification is bounded, and reads become cheap keyed lookups. For very large groups ("channels"), fan-out on read: writing 100 K rows per message is wasteful, and members who never read never pay the cost. The hybrid is to threshold on group size, pay fan-out write for small groups, and use a read-time view for large channels. Cost of the hybrid: two code paths; mitigated by a shared abstraction.
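The hybrid can be sketched as a single store with a size threshold. This is an in-memory illustration under assumptions: the ChatStore class, its method names, and the threshold of 1000 members (taken from the "say < 1000" figure above) are hypothetical, and ordering across groups is ignored.

```python
from collections import defaultdict

FANOUT_WRITE_THRESHOLD = 1000  # groups below this size use fan-out on write


class ChatStore:
    def __init__(self):
        self.members = {}                      # group_id -> set of user ids
        self.mailboxes = defaultdict(list)     # user_id -> messages (write path)
        self.channel_log = defaultdict(list)   # group_id -> messages (read path)

    def send(self, group_id, message):
        """Store a message; returns the number of rows written."""
        members = self.members[group_id]
        if len(members) < FANOUT_WRITE_THRESHOLD:
            for user_id in members:            # one mailbox row per recipient
                self.mailboxes[user_id].append(message)
            return len(members)
        self.channel_log[group_id].append(message)  # one row, joined at read
        return 1

    def read(self, user_id):
        """Mailbox contents plus messages from large channels the user is in."""
        inbox = list(self.mailboxes.get(user_id, []))
        for group_id, members in self.members.items():
            if len(members) >= FANOUT_WRITE_THRESHOLD and user_id in members:
                inbox.extend(self.channel_log.get(group_id, []))
        return inbox


store = ChatStore()
store.members["team"] = {"a", "b", "c"}
store.members["announce"] = {f"u{i}" for i in range(100_000)} | {"a"}

team_rows = store.send("team", "hi")
channel_rows = store.send("announce", "launch")
inbox_a = store.read("a")
inbox_u5 = store.read("u5")
print(team_rows, channel_rows, inbox_a, inbox_u5)
```

The small group costs three mailbox writes per message, while the 100 K-member channel costs one log write and defers the work to members who actually read.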
Question 13: Trade-off Articulation
Write a single-sentence trade-off statement in the form "I chose A over B because C, accepting cost D" for the decision "use cache-aside with explicit invalidation".
Answer: "I chose cache-aside with explicit invalidation over write-through caching for the profile store because write-through couples every profile write to the cache tier's availability and we cannot afford that coupling, accepting a brief staleness window between a write and its invalidation, plus a cache miss for the next reader who repopulates the entry."
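The pattern named in the trade-off statement can be sketched with two dictionaries standing in for the database and the cache. The ProfileStore class and its fields are hypothetical; a real deployment would also have to handle a failed invalidation and read/write races, which this sketch does not.

```python
class ProfileStore:
    """Cache-aside reads with explicit invalidation on write (in-memory sketch)."""

    def __init__(self):
        self.db = {}        # stand-in for the source of truth
        self.cache = {}     # stand-in for the cache tier
        self.db_reads = 0   # counts cache misses that reached the DB

    def get(self, user_id):
        if user_id in self.cache:           # hit: serve from cache
            return self.cache[user_id]
        self.db_reads += 1                  # miss: read the source of truth
        profile = self.db.get(user_id)
        if profile is not None:
            self.cache[user_id] = profile   # reader repopulates the cache
        return profile

    def update(self, user_id, profile):
        self.db[user_id] = profile          # write goes to the DB only
        self.cache.pop(user_id, None)       # then invalidate, never write through


store = ProfileStore()
store.db["u1"] = {"name": "Ada"}

first = store.get("u1")     # miss, fills the cache
second = store.get("u1")    # hit
reads_before_update = store.db_reads

store.update("u1", {"name": "Ada L."})
cached_after_update = "u1" in store.cache
third = store.get("u1")     # miss again, sees the new value
print(first, second, third, store.db_reads)
```

The write path never depends on the cache holding the new value; if the invalidation is delayed, readers see the old entry for that interval, which is the cost the statement accepts.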
Question 14: Four-Phase Discipline
You are 22 minutes into a 45-minute interview and still in phase 2 (high-level diagram). What do you do?
Answer: Stop drawing. Say out loud: "Let me close out high-level and move to deep-dive; I'll come back to any gaps if time allows." Name the diagram as "committed as-is", pick a component to deep-dive (or ask the interviewer), and allocate no more than 15 minutes to phase 3. Protect the last 3-5 minutes for wrap-up regardless. Running over phase 2 into phase 3 is a known failure mode; cutting your losses is the senior move.
Question 15: Design Doc Minimum
Name three sections of a design doc that, if missing, cause a senior reviewer to reject the doc.
Answer: (1) Requirements with numbers (non-functional targets quantified). (2) Scale and failure analysis (10×, AZ outage, per-box failure walk). (3) Trade-offs (explicit decisions with rejected alternatives and costs named). A doc missing any of the three indicates the author did not understand the shape of the problem.
Interleaved Review Questions
Prior Module Question 1 (S7 Architecture & DDD)
Why is a bounded context a design tool rather than just a naming convention?
Answer: A bounded context defines where a set of domain terms has a single, consistent meaning and where transactional integrity holds. Crossing it requires explicit translation (contracts, anti-corruption layers). Without bounded contexts, shared models leak coupling and force unrelated teams to agree on vocabulary.
Prior Module Question 2 (S7 API Design)
What is the difference between an API's contract and its implementation, and which one are consumers coupled to?
Answer: The contract is what consumers can observe (request/response shapes, status codes, idempotency, ordering guarantees); the implementation is how the service achieves it. Consumers are coupled to the contract. Changing implementation freely is safe; changing the contract silently breaks consumers.
Prior Module Question 3 (S6 Distributed Systems)
State the CAP theorem in one sentence and name which property modern systems typically preserve during a network partition.
Answer: CAP says a distributed system can only provide two of consistency, availability, and partition tolerance at a time; during a real network partition (which is unavoidable), the choice is between consistency and availability. Most user-facing systems choose availability and recover consistency afterward; systems handling money or identity typically choose consistency and accept unavailability.
Prior Module Question 4 (S6 Databases)
Why is a secondary index on a write-heavy table expensive, and what is the typical diagnostic when indexes are overused?
Answer: Each secondary index must be updated on every insert and delete, and on every update that touches its indexed columns, so a table with N secondary indexes turns one logical write into up to N+1 physical writes; each index also occupies storage and buffer-pool space. The diagnostic is rising write latency and IO utilization correlated with recent index additions.
Prior Module Question 5 (S7 Architecture)
What is the difference between a quality attribute (architecture characteristic) and a functional requirement, and why do architects pay more attention to the former?
Answer: A functional requirement defines what the system does; a quality attribute (scalability, availability, latency, security) defines how well. Two very different architectures can both meet the same functional requirements; the quality attributes are what separate them. Architects choose shape primarily on quality attributes because they are what the architecture is trying to deliver.
Self-Assessment and Remediation
Mastery Level (90-100% correct):
- Ready to advance to Module 2. You have the methodology; the remaining modules layer specialization on top.
Proficient Level (75-89% correct):
- Review only the missed concept pages and redo the associated mini drills.
- Particularly re-run any missed estimation question with pencil and paper until fluent.
Developing Level (60-74% correct):
- Rework practice pages 1 and 3 (estimation/framing, stress test).
- Run at least two of the four design katas again with a timer.
- Revisit Cluster 3 and Cluster 4 concepts with attention to the mini drills.
Insufficient Level (<60% correct):
- Return to the concept sequence and rebuild the method in order.
- You likely cannot yet estimate, deep-dive, or stress-test a new system without reference; this is the first thing to fix.
- Before advancing, pass a timed 45-minute mock with a peer where you hit all four phases.