
Decompose Each Component

What This Concept Is

Decomposition is zooming in on one box from the high-level diagram and answering five questions about it in five minutes or less:

  1. Inputs: what messages, requests, or events does this box accept? From whom?
  2. Outputs: what does it produce? For whom? Sync response, async event, durable write?
  3. Algorithm: what does it actually do? One paragraph or one Mermaid diagram.
  4. State: what does it own? In-memory only, per-instance? External store? Shared?
  5. Ownership and SLO: which team owns it? What latency/availability target does it publish to callers?
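
The five questions can be held in a small record so that a blank answer is visible at a glance. This is only a sketch: `ComponentSpec`, its field names, and `unanswered` are hypothetical names chosen for illustration, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentSpec:
    """One box from the high-level diagram, described at contract level."""
    name: str
    inputs: list[str] = field(default_factory=list)   # what it accepts, and from whom
    outputs: list[str] = field(default_factory=list)  # what it produces, and for whom
    algorithm: str = ""                               # one-paragraph decision flow
    state: str = ""                                   # what it owns and where it lives
    owner: str = ""                                   # owning team
    slo: str = ""                                     # target published to callers

    def unanswered(self) -> list[str]:
        """Return the template questions that are still blank."""
        answers = {
            "inputs": self.inputs,
            "outputs": self.outputs,
            "algorithm": self.algorithm,
            "state": self.state,
            # Question 5 counts as answered only when both parts are filled in.
            "ownership and SLO": self.owner and self.slo,
        }
        return [question for question, answer in answers.items() if not answer]


spec = ComponentSpec(
    name="Fan-out Worker",
    inputs=["post.created event from Kafka"],
    outputs=["one timeline row per follower"],
    state="stateless per instance; offsets live in Kafka",
    owner="Feed team",
)
print(spec.unanswered())  # ['algorithm', 'ownership and SLO']
```

A non-empty result is the cue to keep talking about this box before moving to the next one.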

The deep-dive conversation always goes the same way: the interviewer points at a box and says "go deeper there". The five questions above are what you run.

A complementary Fundamentals-style discipline: ask, of every box, which forms of coupling it imposes on its callers. Name-coupling (they must know its name), type-coupling (they must know its message types), timing-coupling (they must call it synchronously), ordering-coupling (they must call it in a specific order). A high-quality decomposition minimizes the strongest forms of coupling -- specifically, it tries to eliminate timing and ordering coupling by moving to event-driven flows where the problem allows. This is the connascence framing from Fundamentals of Software Architecture: you are picking where the inevitable connascence lives, not whether it exists.
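
One way to make the coupling question mechanical is to list, per dependency, the forms it imposes and then look for the strongest one. The sketch below is an illustration under assumptions: the numeric ranking (ordering below timing) follows the usual connascence ordering and is not stated in the text, and the dependency names are hypothetical.

```python
from enum import IntEnum


class Coupling(IntEnum):
    """Coupling forms from the text. A higher value means a stronger form
    (the ranking itself is an assumption of this sketch)."""
    NAME = 1      # caller must know the component's name
    TYPE = 2      # caller must know its message types
    ORDERING = 3  # caller must invoke it in a specific order
    TIMING = 4    # caller must invoke it synchronously


def strongest(couplings: dict[str, set[Coupling]]) -> tuple[str, Coupling]:
    """Return the (dependency, form) pair with the strongest coupling."""
    return max(
        ((dep, form) for dep, forms in couplings.items() for form in forms),
        key=lambda pair: pair[1],
    )


# Hypothetical caller that invokes a notifier synchronously ...
sync_call = {"notifier": {Coupling.NAME, Coupling.TYPE, Coupling.TIMING}}
# ... versus the same caller publishing an event to a topic instead.
via_events = {"event topic": {Coupling.NAME, Coupling.TYPE}}

print(strongest(sync_call)[1].name)   # TIMING
print(strongest(via_events)[1].name)  # TYPE
```

Moving to the event-driven flow does not remove coupling; it lowers the strongest form from timing to type, which is the trade the paragraph above describes.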

Why It Matters Here

Without a repeatable decomposition move, deep dives devolve into "tell me everything you know about caches". With it, you stay focused on the one component, and you leave the others intact.

  • Cluster 4 stress tests walk component by component; they require each to be decomposed.
  • Cluster 5 trade-offs at the deep-dive level need the inputs/outputs crisply stated to be defensible.
  • Future modules (S8M2 microservices) reuse the same decomposition template.

Concrete Example: Fan-out Worker

Deep-dive on the Fan-out Worker box in a social feed design.

Inputs:

  • post.created event from Kafka, with {post_id, author_id, created_at, visibility}.
  • Retrieves the author's follower list (from follow-graph service).
  • Retrieves the author's fan-out strategy (materialize vs read-fanout) from a per-user config.

Outputs:

  • Writes one row per follower into the per-user timeline store (wide-column, partitioned by user_id).
  • Emits timeline.updated events for follower devices that have open push sessions.

Algorithm:
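
A minimal sketch of the decision flow, assuming the strategy lookup, the follower pagination, and the batched timeline write are handed in as plain callables. All function and parameter names here are hypothetical, and the timeline.updated emission is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

BATCH_SIZE = 500  # timeline rows per batched write


@dataclass
class PostCreated:
    post_id: str
    author_id: str
    created_at: float
    visibility: str


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield successive lists of at most `size` items."""
    batch: list[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def handle_post_created(
    event: PostCreated,
    fanout_strategy: Callable[[str], str],
    iter_followers: Callable[[str], Iterable[str]],
    write_batch: Callable[[list[tuple[str, str, float]]], None],
) -> int:
    """Process one post.created event and return the number of rows written."""
    if fanout_strategy(event.author_id) == "read-fanout":
        return 0  # celebrity path: no per-follower writes, readers pull at query time
    written = 0
    # iter_followers is assumed to page through the follow graph lazily.
    for followers in batched(iter_followers(event.author_id), BATCH_SIZE):
        rows = [(user_id, event.post_id, event.created_at) for user_id in followers]
        write_batch(rows)
        written += len(rows)
    return written


written = []
rows_written = handle_post_created(
    PostCreated("p1", "alice", 1700000000.0, "public"),
    fanout_strategy=lambda author_id: "materialize",
    iter_followers=lambda author_id: (f"u{i}" for i in range(1200)),
    write_batch=written.append,
)
print(rows_written, [len(rows) for rows in written])  # 1200 [500, 500, 200]
```

Because followers are consumed lazily and written in fixed-size batches, memory use stays bounded by one batch regardless of how many followers the author has.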

Key decisions inside the algorithm:

  • Follower list pagination with stable keys, so one celebrity post does not block on a single 50 M follower scan.
  • Batch size 500 balances per-write cost (wide-column amortization) against per-request latency.
  • Celebrity detection threshold (e.g., >100 K followers) routes to read-fanout; see the trade-off log.

State:

  • Stateless per instance. A Kafka consumer group coordinates partitioning.
  • Per-partition offset is externalized (Kafka offsets).

Ownership and SLO:

  • Owned by the Feed team.
  • SLO: 95th-percentile post-to-timeline latency under 30 seconds for the normal path.
  • SLO on celebrity path is a separate line ("post visible in author timeline within 1 s; follower visibility via read-fanout").
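
To see what the normal-path target does and does not promise, here is a small sketch that checks a set of measured post-to-timeline latencies against it. The nearest-rank percentile method and the sample values are assumptions made for illustration; a real monitoring system may compute percentiles differently.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def meets_slo(latencies_s: list[float], pct: int = 95, target_s: float = 30.0) -> bool:
    """True when the pct-th percentile latency is under the target."""
    return percentile(latencies_s, pct) < target_s


# One slow delivery out of twenty fits inside the 5% the SLO tolerates.
print(meets_slo([5.0] * 19 + [120.0]))      # True
# Two slow deliveries out of twenty push the 95th percentile to 120 s.
print(meets_slo([5.0] * 18 + [120.0] * 2))  # False
```

The point for the deep dive: a percentile SLO is a statement about the tail, so the questions that matter are the ones about what makes a small fraction of posts slow.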

Now every later question about this box ("what if the follow graph is slow?", "what if the author has 10 M followers?", "what if a batch write fails midway?") has a specific place to land.

Concrete Example 2: Redirect Service (URL Shortener)

Inputs: GET /:code from the LB; optional User-Agent, geo-hint headers for analytics.

Outputs: 301 redirect to the long URL (hot path, must be fast); async log event to the analytics topic.

Algorithm:
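
A minimal sketch of the hot path, assuming a bounded per-process LRU in front of a shared lookup and a non-blocking hand-off for the analytics event. `LruCache`, `redirect`, `lookup_shared`, and `enqueue_event` are hypothetical names; `lookup_shared` stands in for the Redis-then-DB read, and the 404 for an unknown code is an assumption of the sketch.

```python
from collections import OrderedDict
from typing import Callable, Optional


class LruCache:
    """Bounded per-process cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: OrderedDict[str, str] = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


def redirect(
    code: str,
    l1: LruCache,
    lookup_shared: Callable[[str], Optional[str]],
    enqueue_event: Callable[[dict], None],
) -> tuple[int, Optional[str]]:
    """Resolve a short code and return (status, location)."""
    long_url = l1.get(code)
    if long_url is None:
        long_url = lookup_shared(code)  # miss path: shared cache, then DB
        if long_url is None:
            return 404, None
        l1.put(code, long_url)
    enqueue_event({"code": code})  # hand-off only; must not wait on analytics
    return 301, long_url


store = {"abc1234": "https://example.com/long"}
shared_calls: list[str] = []
events: list[dict] = []


def lookup(code: str) -> Optional[str]:
    shared_calls.append(code)
    return store.get(code)


l1 = LruCache(capacity=2)
first = redirect("abc1234", l1, lookup, events.append)
second = redirect("abc1234", l1, lookup, events.append)
print(first, second, len(shared_calls))
```

The second request is served from the per-process cache, so the shared lookup is called only once while both requests still produce an analytics event.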

State: stateless; per-process LRU cache (bounded) as a warm L1 in front of Redis.

Ownership and SLO: owned by the Redirect team; P99 < 30 ms per instance; error rate < 0.01%. At cache-hit rates above 98% the latency target is easily met; below 90% it is missed, because too many requests fall through to the slower lookup behind the cache.
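
The sensitivity to hit rate is easier to see as arithmetic. The sketch below assumes a hypothetical 10,000 requests per second (the traffic figure is not from the text) and computes how many of them reach the lookup behind the cache.

```python
def miss_path_qps(total_qps: float, hit_rate: float) -> float:
    """Requests per second that fall through the cache to the slower lookup."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return total_qps * (1.0 - hit_rate)


TOTAL_QPS = 10_000  # assumed traffic, for illustration only

for hit_rate in (0.98, 0.90):
    print(hit_rate, round(miss_path_qps(TOTAL_QPS, hit_rate)))
# 0.98 200
# 0.9 1000
```

Dropping from a 98% to a 90% hit rate multiplies the load on the miss path by five. Note also that once misses exceed 1% of requests, the P99 request is itself a miss, so the latency target holds only while the miss path stays fast under that load.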

Coupling analysis: the box has name- and type-coupling to the DB (it speaks its schema) and to the analytics topic (it speaks its event format). It has no timing-coupling to analytics (async). A change to the analytics event format does not block this service; a change to the DB schema does. This is exactly the ranking we want -- hot-path tightly coupled, cold-path loosely coupled.

This template scales to every box on the diagram. The key insight: the same five questions plus the coupling analysis produce comparable designs across very different components, which is what makes this move a methodology rather than a per-box improvisation.

Common Confusion / Misconceptions

"Decomposition is implementation." No. Decomposition is contract-level. You are specifying what the box is, not how it is coded. You do not write classes; you write inputs/outputs/algorithm/state.

"I will describe all the boxes at once." The deep dive is one box at a time. The interviewer picks; you deliver; then they pick the next. Spraying across all boxes is how you run out of time in phase 3.

"Algorithm equals code." Algorithm at this level is a pseudocode-or-diagram sketch of the decision flow: what happens when the input arrives, in what order, with what fan-out, with what backpressure.

"Ownership is a management detail." Ownership surfaces coupling: if two teams own one box, that is a design smell. If one team owns eight boxes, that is a decomposition smell.

"Decomposition is only about what a box does." It is also about what it refuses to do. A well-decomposed box has clear non-responsibilities. "Fan-out Worker does not do ranking; ranking is a separate component fed from the timeline store" is a decomposition statement that saves hours of later argument.

How To Use It

Five-minute deep dive per box. Walk through the five-question template out loud, on the board, in order.

If the reviewer asks a question you cannot place under one of the template's five questions, you either do not understand the component yet or you are wandering outside its boundary. Either way, state that and redirect.

Transfer / Where This Shows Up Later

  • Cluster 4 (stress test) assumes each box can be described at this contract level; without it, failure walks are vague.
  • Cluster 5 concept 15 (design doc) has one subsection per box -- this template is literally what goes there.
  • S8M2 (microservices) uses this template for each service boundary decision (is this one service or two? the answer lives in the coupling analysis).
  • S8M3 (data patterns) decomposes data components (stores, caches, queues) with the same shape.
  • S9 (cloud) maps each decomposed box to a specific runtime (ECS service, Lambda, managed store); the SLO row becomes an alarm target.
  • S10 interviews: when the interviewer says "go deeper on X", this template is what "going deeper" looks like.

Check Yourself

  1. For the Redirect Service in a URL shortener, what are the inputs, outputs, algorithm, state, and SLO?
  2. For an ID Generator Service that produces 7-character short codes at 10 K QPS with no collisions, what is the algorithm, and where is the contention point?
  3. If the same team owns both "Write API" and "Read API" of the same service, is that two components or one?
  4. Identify one non-responsibility of the Fan-out Worker example above.

Mini Drill or Application

Pick three boxes from any high-level diagram you drew in the previous cluster. For each, deep-dive using the five-question template in under five minutes. Then:

  1. Identify the one question a reviewer is most likely to ask next.
  2. Prepare a one-sentence answer for it.
  3. Note any box whose state is implicit or unowned -- that is a redesign flag.

Read This Only If Stuck