Why Not Microservices: The Cost Model

What This Concept Is

Microservices do not give you independent deployability for free. They move complexity from inside-the-process to across-the-network, from compile-time to runtime, and from one codebase to many operational surfaces. The cost model says: the style only pays back if you already have the organizational and operational capabilities to absorb that shift.

The prerequisites worth checking before you adopt the style:

  • automated build, test, and deploy for each service (each service = its own pipeline)
  • production observability: metrics, logs, distributed traces
  • a platform for service discovery, secrets, and configuration per environment
  • a culture of contracts and backward compatibility between teams
  • an on-call practice that can handle distributed failure modes
  • teams large enough that each can own at least 2-3 services

Without most of these, the style imposes overhead that will dominate any benefit.

Why It Matters Here

Teams adopt microservices because of trend pressure, hiring pitches, or something they took from Netflix and Amazon talks. Two quarters later, feature velocity is lower than with the monolith they replaced, ops cost is higher, and the team is in permanent incident response. The cost model is the filter that would have stopped that.

Concrete Example

Two realistic 10-engineer teams, same product:

  • Team A builds a modular monolith. One repo, one deploy, three clear modules behind tight interfaces, one database with three schemas. They ship 10-20 times a day.
  • Team B builds microservices: 9 services, 9 pipelines, 9 on-call rotations, a shared Kubernetes cluster, no distributed tracing, ad-hoc contracts. They ship 2-3 times a week and spend 30% of calendar time on platform and coordination work.

Team B may eventually be faster if they invest in the prerequisites. Without that investment, Team A stays ahead for years.
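
The gap between the two teams is easier to judge on a common scale. The sketch below converts Team A's daily rate to a weekly one and turns Team B's 30% into engineer-equivalents. The five-day working week is an assumption (the example does not state one), and `weekly_deploy_range` is a hypothetical helper name.

```python
def weekly_deploy_range(per_day: tuple[int, int], working_days: int = 5) -> tuple[int, int]:
    """Convert a (low, high) deploys-per-day range to deploys per week.

    working_days=5 is an assumption, not a figure from the example.
    """
    low, high = per_day
    return low * working_days, high * working_days

team_a = weekly_deploy_range((10, 20))   # Team A: 10-20 deploys a day
team_b = (2, 3)                          # Team B: 2-3 deploys a week

# Most and least favourable comparisons for Team B.
ratio_low = team_a[0] / team_b[1]
ratio_high = team_a[1] / team_b[0]

# 30% of calendar time across 10 engineers, as engineer-equivalents.
platform_engineers = 10 * 0.30

print(f"Team A: {team_a[0]}-{team_a[1]} deploys/week, Team B: {team_b[0]}-{team_b[1]}")
print(f"Team A ships roughly {ratio_low:.0f}x to {ratio_high:.0f}x more often")
print(f"Team B platform and coordination work: about {platform_engineers:.0f} of 10 engineers")
```

Deploy frequency is only a proxy for feature velocity, but the size of the gap shows how much prerequisite investment Team B needs before the split pays back.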

Common Confusion / Misconception

"Microservices scale better." Performance scale is mostly orthogonal to architectural style at normal product scale. Monoliths scale horizontally up to very large loads (Shopify, Stack Overflow, and GitHub have all run monoliths at huge scale). What microservices scale is the organization -- the number of teams that can work independently without stepping on each other.

Second confusion: "we are too small, but we will grow into it." You grow into capabilities, not into style. A 4-person team with 12 microservices will still have 12 microservices when the team is 6, and now each person is on 2 on-call rotations.
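
The on-call arithmetic behind this second confusion can be sketched as below. It assumes, purely for illustration, that every service has its own rotation and that rotations are spread evenly across the team. `rotations_per_person` and its `people_per_rotation` parameter are hypothetical names, not part of any real tool.

```python
from fractions import Fraction

def rotations_per_person(services: int, engineers: int, people_per_rotation: int = 1) -> Fraction:
    """Average number of on-call rotations each engineer sits on.

    Assumes one rotation per service and an even spread across the team.
    """
    if engineers <= 0:
        raise ValueError("engineers must be positive")
    return Fraction(services * people_per_rotation, engineers)

# The 12-service example from the text, before and after the team grows.
for team_size in (4, 6):
    load = rotations_per_person(services=12, engineers=team_size)
    print(f"{team_size} engineers, 12 services: {float(load):.1f} rotations each")
```

Staffing each rotation with two people, so that nobody is the only responder, doubles the load: under the same assumptions the team of 6 sits on 4 rotations each.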

How To Use It

Before proposing microservices, answer these questions in writing:

  1. Independent deploy test. Do you already have a pipeline that could deploy 5+ services independently today?
  2. Observability test. If a request is slow, can you say which service caused it in under 2 minutes?
  3. Contract test. Have you ever run a consumer-driven contract test? Do your teams write OpenAPI or schema definitions?
  4. Team test. Do you have at least one clearly-separated team per proposed service, each with on-call capacity?
  5. Fallback test. If you split and it does not work, do you have a migration path back to a monolith?

If you cannot answer yes to at least 3 of the 5, you are not ready. Build the capability first, or stay with a modular monolith and revisit.
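
One way to make the written answers explicit is a small gate like the sketch below. It reads the rule as "at least three yes answers out of five", which is an interpretation of the text. The question keys and the `is_ready` function are hypothetical names chosen for illustration.

```python
# Keys are shorthand for the five questions above (hypothetical names).
READINESS_QUESTIONS = (
    "independent_deploy",
    "observability",
    "contract",
    "team",
    "fallback",
)

def is_ready(answers: dict[str, bool], minimum_yes: int = 3) -> bool:
    """Return True when enough of the five questions are answered yes.

    minimum_yes=3 encodes one reading of the rule; adjust it if your
    team reads the threshold differently.
    """
    missing = [q for q in READINESS_QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"unanswered questions: {missing}")
    yes_count = sum(1 for q in READINESS_QUESTIONS if answers[q])
    return yes_count >= minimum_yes

example = {
    "independent_deploy": True,
    "observability": False,
    "contract": False,
    "team": True,
    "fallback": False,
}
print(is_ready(example))  # two yes answers, below the threshold
```

Forcing every question to have an answer matters as much as the threshold: an unanswered question is usually a no that nobody wrote down.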

Check Yourself

  1. Why is "scales better" not automatically an argument for microservices?
  2. Name three operational capabilities without which microservices tend to make things slower, not faster.
  3. Why does a 4-person team with 12 microservices tend to fail in a specific on-call way?

Mini Drill or Application

For a real system you know, score it 0-2 on each of the five readiness tests above. Total /10.

  • 0-4: modular monolith, plain and simple
  • 5-7: carefully chosen split (one or two services extracted)
  • 8-10: ready for broader decomposition

Write one sentence for each score. The habit you are building: refuse to call the decision "inevitable".

How This Sits In The Module

The next concept (monolith-first + strangler fig) gives you the specific pattern for moving from a monolith to microservices after the cost model says yes. Cluster 4 resilience and cluster 5 deployment independence are the operational capabilities the cost model is asking for.

Depth Path

  • Compare the "prerequisites" list above to your team's actual capabilities and write down the gaps. Expect that list of gaps to grow by one line per production incident. That is fine; just keep it honest.
  • Shopify engineering blog's "Deconstructing the monolith" and Stack Overflow's "Stack Overflow: The Architecture -- 2016 Edition" are concrete case studies of successful monoliths at scale. Both are prerequisite reading before you argue "we must go microservices to scale."

Transfer: When The Cost Model Says "Partial"

Most real teams score 5-7 on the readiness worksheet. The advice is not to stay monolith forever or to go full microservices; it is to extract one or two services where the value is clearly worth the cost, keeping the rest as a modular monolith. This is the strangler-fig pattern in concept 03 applied sparingly. The worst outcome is declaring an org-wide microservices migration while scoring 4/10; the second worst is refusing any extraction while scoring 8/10. Calibrate the scope to the score.

A useful heuristic from S7 M5 concept 02 (reversibility): treat "go microservices org-wide" as a one-way door; treat "extract the notification sender" as a two-way door. Give each decision the amount of ceremony its reversibility calls for.

Failure Modes When the Cost Model Is Ignored

Specific patterns that appear when teams skip the cost model:

  • Ops weight dominates feature work. The team spends more time on platform plumbing (discovery, secrets, pipelines) than on the product.
  • Debugging time increases. A single slow request now requires correlating logs across 5-10 services, but the tracing stack is not there, so it falls back to grep and timestamp-matching.
  • Release cadence drops. Despite separate pipelines, every meaningful change requires "just a small coordinated deploy" with another service.
  • On-call fatigue. A small team now holds 5-10 on-call rotations. Pages come at night. People leave. Institutional memory exits.
  • The monolith comes back as a distributed monolith. Shared libraries, shared DBs, or lockstep contracts recreate the original coupling -- now with extra network latency.

If you see three or more of these within 6 months of a migration, the cost model was skipped. Stop adding services and invest in the capability gaps.
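
The debugging failure mode is easier to see with data. The sketch below assumes each service emits one record per request carrying a shared trace ID, which is what a tracing stack provides. The record layout and the names `Span` and `slowest_service` are hypothetical and do not correspond to any particular tracing library.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    trace_id: str       # shared by every hop of one request
    service: str
    duration_ms: float  # time spent inside this service only

def slowest_service(spans: list[Span], trace_id: str) -> tuple[str, float]:
    """Return the service that spent the most time on one request."""
    totals: dict[str, float] = defaultdict(float)
    for span in spans:
        if span.trace_id == trace_id:
            totals[span.service] += span.duration_ms
    if not totals:
        raise KeyError(f"no spans recorded for trace {trace_id!r}")
    service = max(totals, key=totals.get)
    return service, totals[service]

# Made-up sample data for two requests.
spans = [
    Span("req-1", "gateway", 12.0),
    Span("req-1", "orders", 48.0),
    Span("req-1", "pricing", 910.0),
    Span("req-2", "gateway", 9.0),
    Span("req-2", "orders", 41.0),
]
print(slowest_service(spans, "req-1"))  # ('pricing', 910.0)
```

Without a shared `trace_id` there is nothing to group on, which is why teams without tracing end up matching timestamps by hand.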

Scoring Worksheet

For a real or proposed decomposition, fill this in:

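Readiness area                                   | 0 (absent) | 1 (partial) | 2 (strong) | Your score
-------------------------------------------------|------------|-------------|------------|-----------
Automated per-service CI/CD                      | --         | --          | --         | ?
Distributed tracing                              | --         | --          | --         | ?
Service discovery + config platform              | --         | --          | --         | ?
Contract culture (CDC, OpenAPI, schema registry) | --         | --          | --         | ?
On-call maturity                                 | --         | --          | --         | ?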

Total out of 10. The rule of thumb from above:

  • 0-4: modular monolith
  • 5-7: one or two careful splits
  • 8-10: broader decomposition

Revisit this every quarter. The scoring is supposed to change.
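
The worksheet and its rule of thumb can be written down as a small function, which makes the quarterly revisit a matter of updating five numbers. This is a minimal sketch: the area names and score bands come from the worksheet, while the `recommend` function and the example scores are hypothetical.

```python
AREAS = (
    "Automated per-service CI/CD",
    "Distributed tracing",
    "Service discovery + config platform",
    "Contract culture",
    "On-call maturity",
)

def recommend(scores: dict[str, int]) -> tuple[int, str]:
    """Total the five 0-2 scores and map the total to the rule of thumb."""
    for area in AREAS:
        if scores.get(area) not in (0, 1, 2):
            raise ValueError(f"{area!r} needs a score of 0, 1, or 2")
    total = sum(scores[area] for area in AREAS)
    if total <= 4:
        return total, "modular monolith"
    if total <= 7:
        return total, "one or two careful splits"
    return total, "broader decomposition"

# Illustrative scores, not data from a real team.
this_quarter = {
    "Automated per-service CI/CD": 2,
    "Distributed tracing": 0,
    "Service discovery + config platform": 1,
    "Contract culture": 1,
    "On-call maturity": 1,
}
print(recommend(this_quarter))  # (5, 'one or two careful splits')
```

Keeping each quarter's scores side by side shows whether the capability gaps are actually closing.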

A Less-Noticed Cost: Cognitive Load

Per-engineer cognitive load is the limiting factor most teams forget. Every service an engineer touches adds:

  • a repo, a build tool version, a dependency graph
  • a local dev setup (docker-compose file, seed data)
  • a deployment pipeline, a rollback procedure
  • a set of runbooks and alert rules

Team Topologies (covered in concept 15) frames this explicitly: a team should own only as many services as it can keep changing with slack to spare, not so many that every service rusts. Usually 2-4 per team is the range that works.
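
A quick way to apply the 2-4 range is to write down who owns what and flag the outliers. The sketch below assumes a plain mapping from team to owned services. The team names, service names, and the `ownership_outliers` helper are all hypothetical.

```python
def ownership_outliers(
    ownership: dict[str, list[str]], low: int = 2, high: int = 4
) -> dict[str, int]:
    """Return teams whose service count falls outside the low..high range."""
    return {
        team: len(services)
        for team, services in ownership.items()
        if not low <= len(services) <= high
    }

# Hypothetical ownership map for illustration.
ownership = {
    "checkout": ["cart", "payments", "orders"],
    "platform": ["gateway", "auth", "config", "discovery", "notifications", "search"],
    "growth": ["emails"],
}
print(ownership_outliers(ownership))  # {'platform': 6, 'growth': 1}
```

A flagged team is a prompt for a conversation, not a verdict; the range is a rule of thumb, and the bounds are parameters for that reason.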