Module 2: Microservices & Service Decomposition

Primary texts: Building Microservices (Sam Newman, 2nd ed.) and Microservices Patterns (Chris Richardson)

Selective support: Fundamentals of Software Architecture (Richards & Ford) microservices chapter, System Design Primer application-layer and communication chunks, Team Topologies (Skelton & Pais) for organization alignment

This guide is the primary teacher. You do not need to read the source books front-to-back. You do need to leave this module able to (a) decide when microservices are the right tool, (b) draw defensible service boundaries on a real system, (c) own data per service without a shared database, and (d) operate the resulting distributed system with contracts, resilience patterns, and tracing.

Scope of This Module

Microservices is not "many small services" and it is not "containerize the monolith." It is a specific architectural style that carries a real operational cost, and whether it pays back depends on organizational and operational maturity. This module is about learning when that cost is worth paying and how to decompose a system without creating a distributed monolith.

What it covers in depth:

the microservices distillation: independent deployability, team ownership, and bounded business scope
the cost model: what you must have before the style pays back
monolith-first defaults and the strangler-fig migration pattern
bounded contexts as service boundaries and decomposition heuristics by business capability, subdomain, and noun/verb
anti-patterns: distributed monolith, entity services, shared database, "CRUD-a-service"
database-per-service and the data-ownership discipline
service contracts (sync APIs, async events, tolerant readers) and consumer-driven contract testing
synchronous vs asynchronous communication and when each is right
service discovery, API gateways, and the BFF (Backend-for-Frontend) pattern
resilience primitives: timeouts, retries, circuit breakers, and bulkheads
distributed tracing, correlation IDs, and observability for request flow
deployment independence: versioning, backward compatibility, zero-downtime
Conway's Law and stream-aligned teams

What it deliberately does not cover here:

saga orchestration and distributed transaction mechanics (S8 M3 event-driven architecture)
service mesh, Kubernetes operators, and runtime platform details (production path, later)
DDD strategic and tactical mechanics as a separate course (already built in S7 M3)
full SRE/SLO math (S8 M4 scale, reliability, performance)

Before You Start

Answer closed-book:

In one sentence, what makes a service a "microservice" rather than just a service?
Give one concrete reason a three-engineer team should not build microservices yet.
What is the difference between a bounded context and a service?
Why is "two services sharing the same database" often a warning sign?
If service A calls service B synchronously and B is slow, what happens to A without a timeout?

Diagnostic Interpretation

4-5 solid answers -- Ready for the full path.
2-3 solid answers -- Continue, but expect extra time in Clusters 2 and 3.
0-1 solid answers -- Revisit S7 M2 (architecture patterns) and S7 M3 (bounded contexts) briefly. Microservices is a way of deploying ideas that are taught there first.

What This Module Is For

You have probably seen one of two failure patterns already:

a monolith that has become a "big ball of mud" and cannot be safely changed
a set of "microservices" that must all be deployed together and share one database

This module teaches you to avoid both. After it, you should be able to answer:

should this system be microservices at all, or a modular monolith?
where should the service seams go, and what data does each side own?
what contract do the services expose, and how do we change it without breaking consumers?
how do we communicate between services without turning a slow dependency into a cascading outage?
how do we tell, in production, which service is responsible for a slow or failing request?

This module feeds directly into:

S8 M3 Event-Driven Architecture -- the async side of service contracts
S8 M4 Scale, Reliability & Performance -- the SLO/SLI view of the resilience primitives
S8 M5 Technical Leadership & Strategy -- the ADR for "monolith vs microservices" is one of the most common architecture decisions you will write

Concept Map

How To Use This Module

Work in order. The later clusters only make sense once the decomposition and data-ownership reflexes are stable.

Cluster 1: When and Why Microservices

Order	Concept	Type	Focus
1	The Microservices Distillation: Independent Deploy, Team-Owned, Bounded	PRIMARY	What actually defines the style vs what is just "services"
2	Why Not Microservices: The Cost Model	PRIMARY	The capabilities you must already have before it pays back
3	Monolith-First and Strangler-Fig Migration	PRIMARY	Default posture and the single migration pattern worth memorizing

Cluster mastery check: Can you say, for a specific team and product, whether microservices is the right style and why?
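
The strangler-fig idea in concept 3 can be pictured as a routing facade that sends a growing set of paths to extracted services while everything else still reaches the monolith. The following is a minimal sketch under stated assumptions, not a production gateway: the class name `StranglerRouter`, the upstream URLs, and the `/invoices` prefix are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

MONOLITH = "http://monolith.internal"  # hypothetical upstream, for illustration only


@dataclass
class StranglerRouter:
    """Maps a request path to an extracted service, falling back to the monolith."""

    default_upstream: str
    extracted: dict[str, str] = field(default_factory=dict)  # path prefix -> upstream

    def extract(self, prefix: str, upstream: str) -> None:
        """Record that one slice of functionality now lives in its own service."""
        self.extracted[prefix] = upstream

    def upstream_for(self, path: str) -> str:
        # Match on whole path segments so "/invoicesx" does not match "/invoices".
        matches = [
            p for p in self.extracted
            if path == p or path.startswith(p.rstrip("/") + "/")
        ]
        if not matches:
            return self.default_upstream
        # Longest prefix wins, so a sub-path can be moved separately later.
        return self.extracted[max(matches, key=len)]


router = StranglerRouter(default_upstream=MONOLITH)
router.extract("/invoices", "http://billing.internal")  # first strangler cut (assumed)

print(router.upstream_for("/invoices/42"))
print(router.upstream_for("/orders/1"))
```

Each later cut is one more `extract` call. The monolith keeps serving whatever has not been moved, which is what makes the migration incremental and reversible.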

Cluster 2: Finding Service Boundaries

Order	Concept	Type	Focus
4	Bounded Contexts as Service Boundaries	PRIMARY	The DDD concept that actually carries across to deployment units
5	Decomposition Heuristics: Capability, Subdomain, Noun/Verb	PRIMARY	Three lenses, one answer -- and what to do when they disagree
6	Avoiding Distributed-Monolith and Entity-Service Anti-Patterns	PRIMARY	How otherwise-reasonable decompositions fail

Cluster mastery check: Can you split a given monolith into three named services and justify the data each one owns?

Cluster 3: Data Ownership and Contracts

Order	Concept	Type	Focus
7	Database-Per-Service and the Shared-DB Temptation	PRIMARY	Why shared databases silently re-couple services
8	Service Contracts: Sync APIs, Async Events, Tolerant Readers	PRIMARY	The three contract shapes and the tolerance discipline
9	Consumer-Driven Contract Testing	PRIMARY	Catching breaking changes before the deploy

Cluster mastery check: Can you write both an OpenAPI fragment and an event schema for the same cross-service interaction?
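
The tolerant-reader discipline from concept 8 is easier to see in code than in prose. The sketch below assumes a hypothetical `OrderPlaced` event with fields `order_id`, `total_cents`, and an optional `currency`. None of these names come from the module; they stand in for whatever schema you design. The consumer reads only the fields it needs and ignores the rest, so the producer can add fields without breaking it.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Consumer-side view of a hypothetical OrderPlaced event."""

    order_id: str
    total_cents: int
    currency: str = "USD"  # assumed default for producers that omit the field


def read_order_placed(raw: str) -> OrderPlaced:
    payload = json.loads(raw)
    # Tolerant reader: pick out only the fields this consumer depends on.
    # Unknown fields are ignored; a missing required field raises KeyError.
    return OrderPlaced(
        order_id=str(payload["order_id"]),
        total_cents=int(payload["total_cents"]),
        currency=payload.get("currency", "USD"),
    )


# A newer producer added "coupon"; the consumer keeps working unchanged.
v2_event = '{"order_id": "o-17", "total_cents": 2599, "currency": "EUR", "coupon": "SPRING"}'
print(read_order_placed(v2_event))
```

Under this reading style, adding an optional field is a compatible change, while removing or renaming `order_id` or `total_cents` is a breaking one. A consumer-driven contract test is meant to catch that second kind of change before deploy.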

Cluster 4: Service Communication

Order	Concept	Type	Focus
10	Synchronous REST/gRPC vs Asynchronous Events	PRIMARY	Picking the right style per interaction, not per system
11	Service Discovery, API Gateways, BFF Pattern	PRIMARY	Edge vs internal routing and client-shaped aggregations
12	Resilience: Timeouts, Retries, Circuit Breakers, Bulkheads	PRIMARY	Stopping one slow dependency from taking down the cluster

Cluster mastery check: Can you draw the timeline of a call from frontend to a failing downstream service with every resilience primitive in place?
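
One way to rehearse that timeline is to walk a call through a small model of the primitives. The sketch below is illustrative only and makes several assumptions: the names `CircuitBreaker`, `backoff_delay`, and `call_with_resilience` are invented here, the operation is expected to enforce the timeout it is handed and raise `TimeoutError`, only timeouts are retried, and bulkheads are left out to keep the example short. Real libraries differ in their state handling and defaults.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without contacting the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: calls flow normally
        # Half-open: once the cool-down has passed, let a trial call through.
        return self.clock() - self.opened_at >= self.reset_after_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open, or re-open after a failed trial


def backoff_delay(attempt: int, base_s: float = 0.1, cap_s: float = 2.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap_s, base_s * (2 ** attempt)))


def call_with_resilience(op: Callable[[float], T], breaker: CircuitBreaker,
                         timeout_s: float = 0.5, max_attempts: int = 3,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op(timeout_s) with bounded retries; max_attempts must be at least 1."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        if not breaker.allow():
            raise CircuitOpenError("dependency circuit is open")
        try:
            result = op(timeout_s)  # op is responsible for honoring the timeout
            breaker.record_success()
            return result
        except TimeoutError as exc:
            breaker.record_failure()
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))  # wait before the next attempt
    assert last_error is not None
    raise last_error
```

With the defaults, three consecutive timeouts exhaust the retries and open the breaker. The next caller then fails fast with `CircuitOpenError` instead of waiting on a dependency that is already known to be slow, which is the behavior that stops a slow dependency from becoming a cascading outage.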

Cluster 5: Operating Microservices

Order	Concept	Type	Focus
13	Distributed Tracing and Correlation IDs	PRIMARY	Reconstructing one request across many services
14	Deployment Independence: Versioning and Backward Compatibility	PRIMARY	Why independent deploy is a property of contracts, not of pipelines
15	Team Topology: Conway's Law and Stream-Aligned Teams	SUPPORTING	The organizational shape that makes all of the above actually work

Cluster mastery check: Can you point at a trace from production and explain which service owns the p99 latency and which team would be paged?
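
Reading a waterfall comes down to asking where the time is spent once child spans are subtracted from their parents. The sketch below models one trace as plain records and computes per-service self time. The `Span` fields, the service names, and the timings are invented for illustration, and the arithmetic assumes child spans run sequentially without overlap. It covers a single trace only; a p99 claim needs the same analysis across many traces.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str             # correlation ID shared by every span of one request
    span_id: str
    parent_id: Optional[str]  # None marks the root span at the edge
    service: str
    start_ms: int
    end_ms: int


def self_time_by_service(spans: list[Span]) -> dict[str, int]:
    """Span duration minus direct children's durations, summed per service."""
    child_time: dict[str, int] = defaultdict(int)
    for s in spans:
        if s.parent_id is not None:
            child_time[s.parent_id] += s.end_ms - s.start_ms
    totals: dict[str, int] = defaultdict(int)
    for s in spans:
        totals[s.service] += (s.end_ms - s.start_ms) - child_time[s.span_id]
    return dict(totals)


# Hypothetical trace: gateway -> orders -> (inventory, then payments).
trace = [
    Span("t-1", "a", None, "gateway", 0, 500),
    Span("t-1", "b", "a", "orders", 10, 480),
    Span("t-1", "c", "b", "inventory", 20, 60),
    Span("t-1", "d", "b", "payments", 70, 470),
]

self_times = self_time_by_service(trace)
slowest = max(self_times, key=self_times.get)
print(self_times, slowest)
```

In this made-up trace the gateway span is the longest, but almost all of its duration belongs to a downstream call. Self time is what points at the service, and therefore the team, that owns the latency.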

Then work these practice pages:

Order	Practice path	Focus
1	Decomposition Lab	Turning a monolith into named services with explicit owned data
2	Contracts and Data Workshop	Writing an OpenAPI fragment and an event schema for the same flow
3	Resilience Clinic	Timeouts, retries, circuit breakers, and bulkheads on one call path
4	Microservices Katas	Decompose e-commerce, write a contract test, wire a circuit breaker, model a distributed trace

Use the Module Quiz after the concept and practice path. Use "Reference and Selective Reading" and "Learning Resources" only for targeted reinforcement.

Learning Objectives

By the end of this module you should be able to:

Propose and defend whether a given product should use microservices, a modular monolith, or a hybrid.
Decompose a monolith into 3-6 named services with explicit owned data and a migration sequence.
Identify and remove distributed-monolith and entity-service anti-patterns in a proposed design.
Design database-per-service ownership and explain how cross-service data is obtained (composition vs replication vs events).
Write a service contract as an OpenAPI fragment and an equivalent event schema, including compatibility rules.
Explain consumer-driven contract testing and run a thought-experiment contract test for a named interaction.
Choose synchronous vs asynchronous per interaction and justify the choice.
Apply timeouts, retries (with backoff + jitter), circuit breakers, and bulkheads to a specific call path and trace what happens under a downstream outage.
Read and interpret a distributed trace, including correlation IDs, span hierarchy, and where latency is concentrated.
Describe a team topology that preserves deployment independence and explain Conway's Law in one paragraph.

Outputs

one decomposition memo for a named monolith (3-6 services, owned data table, first strangler cut)
one OpenAPI fragment and one matching event schema for a single interaction
one consumer-driven contract test specification (provider, consumer, and one example)
one resilience sequence diagram (mermaid) for a specific call path
one mock distributed-trace waterfall with correlation IDs and p99 narration
one ADR-style 1-page memo arguing "microservices" or "modular monolith" for a real or chosen system
one mistake log with at least 12 tagged anti-patterns (shared db, entity service, synchronous fan-out, retry without backoff, timeouts absent, contract break, lockstep deploy, etc.)
a set of flashcards covering the 15 concepts

Completion Standard

You have completed this module when all of these are true:

you can decide when not to use microservices with concrete evidence, not aesthetics
you can draw service seams from bounded contexts and business capability, and explain where they disagree
you can refuse a shared database and explain the alternatives in sentences, not buzzwords
you can write a contract and a compatibility rule without looking it up
you can describe the full path of a request from edge to downstream failure with every resilience primitive in place
you can read a trace and say what the owning team should change

If the answer sounds familiar but you cannot name the owned data and the failure mode, the module is not complete.

Reading Policy

Concept pages are the main path.
Local book chunks are selective reinforcement, not a second syllabus.
"Read this only if stuck" means try the concept page, self-check, and drill first.
"Optional deep dive" means nuance, not required progression.
For DDD framing (subdomain, ubiquitous language, aggregates), lean on Semester 7, Module 3 -- this module only restates what it needs and adds the deployment side.

Suggested Weekly Flow

Day	Work
1	Concepts 1-3, write the cost-model memo for a chosen product
2	Concepts 4-6, draft a decomposition of an e-commerce monolith
3	Concepts 7-9, write an OpenAPI fragment and matching event schema
4	Concepts 10-12, draw the resilience sequence diagram
5	Concepts 13-15, sketch a distributed trace and a team topology
6	Practice pages 1-2 and targeted book chunk reinforcement
7	Practice pages 3-4, quiz, and mistake-log cleanup

Reference

If you need exact links into the local chunked books, use Reference and Selective Reading.

Rich Learning Pages
