Module 4: API Design & Contract Evolution: Case Studies

These case studies treat APIs as architecture. Every endpoint, event, schema field, error shape, pagination token, and deprecation notice is a contract somebody else's code may depend on.

How To Use These Case Studies

Identify the consumer and the contract they rely on.
Separate implementation change from contract change.
Name the compatibility rule.
Produce the required artifact.
Decide how the contract will be tested and deprecated.

Case Study 1: Idempotent Payment Creation

Scenario: A mobile app calls POST /payments and times out. The user taps again. Without an idempotency contract, the server may create two payments.

Source anchor: Stripe's API docs describe idempotency keys as unique client-generated keys used to recognize retries and return the same result for repeated requests. See Stripe's documentation on idempotent requests.

Module concepts:

idempotency
POST side effects
retry safety
request identity
timeout ambiguity

Wrong Approach

"POST is not idempotent, so clients should not retry."

Networks fail. Clients will retry. The API should make safe retry possible for operations where duplicate side effects are unacceptable.

Better Approach

Define an idempotency contract:

POST /payments
Idempotency-Key: 8f6d...

Server rule:

same key + same request body -> return original result
same key + different request body -> reject
key retention -> documented window
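
The server rule can be sketched as follows. This is a minimal in-memory illustration, not Stripe's implementation: PaymentService, StoredResult, and IdempotencyConflict are hypothetical names, the body comparison uses a SHA-256 hash of canonical JSON (an assumption), and key retention and concurrent requests are deliberately left out.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class StoredResult:
    body_hash: str  # fingerprint of the first request seen for this key
    response: dict  # response replayed on every retry


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


class PaymentService:
    """Hypothetical in-memory service; a real one needs a durable key store."""

    def __init__(self):
        self._results = {}  # idempotency key -> StoredResult
        self._next_id = 1

    @staticmethod
    def _fingerprint(body: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def create_payment(self, idempotency_key: str, body: dict) -> dict:
        body_hash = self._fingerprint(body)
        stored = self._results.get(idempotency_key)
        if stored is not None:
            if stored.body_hash != body_hash:
                # same key + different request body -> reject
                raise IdempotencyConflict(idempotency_key)
            # same key + same request body -> return original result
            return stored.response
        response = {
            "payment_id": f"pay_{self._next_id}",
            "status": "created",
            "request": body,
        }
        self._next_id += 1
        self._results[idempotency_key] = StoredResult(body_hash, response)
        return response
```

A retry with the same key and body returns the stored response without creating a second payment; reusing the key with a changed amount raises IdempotencyConflict, which an HTTP layer would map to whatever conflict response the contract documents.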

Tradeoff Table

Choice	Gain	Cost
no retry	avoids duplicates	poor reliability
client retry without key	hides transient faults	duplicate side effects
idempotency key	safe retry	key store and body-hash logic
natural resource ID via PUT	idempotent by URI	not always a good domain fit

Failure Mode

Timeout ambiguity turns into duplicate payments, orders, emails, or reservations.

Required Artifact

Write an idempotency contract: key scope, retention, body comparison, response replay, and conflict response.

Project / Capstone Connection

Every capstone API with side effects should include retry-safe idempotency rules.

Case Study 2: Cursor Pagination Instead Of Page Numbers

Scenario: GET /orders?page=3&size=50 works in staging. In production, new orders arrive while a client paginates, causing duplicate or skipped rows.

Source anchor: Microsoft REST API guidance discusses paging large collections and API design consistency. See Microsoft Learn's REST API best practices.

Module concepts:

pagination
stable ordering
cursor
consistency window
collection contract

Wrong Approach

"Page number plus limit is enough."

Offset pagination is easy but unstable when the collection changes during traversal.

Better Approach

Use cursor pagination with a stable sort:

GET /orders?limit=50&after=eyJjcmVhdGVkX2F0Ijoi...

Contract:

order by created_at desc, id desc
cursor encodes last tuple
new writes sort before the first page and do not shift later pages
client follows next_cursor until null
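
A minimal sketch of this contract over an in-memory list. The function names and the cursor layout (base64url-encoded JSON of the last tuple) are assumptions for illustration, and created_at is assumed to be an ISO 8601 string in a single fixed format so that string order matches time order.

```python
import base64
import json


def encode_cursor(created_at: str, order_id: int) -> str:
    # The cursor encodes the last (created_at, id) tuple the client saw.
    raw = json.dumps({"created_at": created_at, "id": order_id}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> tuple:
    try:
        data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
        return (data["created_at"], data["id"])
    except (ValueError, KeyError, TypeError) as exc:
        # The contract should name one documented invalid-cursor response.
        raise ValueError("invalid cursor") from exc


def list_orders(orders, limit=50, after=None):
    # Stable total order: created_at desc, id desc (id breaks timestamp ties).
    ordered = sorted(orders, key=lambda o: (o["created_at"], o["id"]), reverse=True)
    if after is not None:
        last = decode_cursor(after)
        # Keep only rows that sort strictly after the cursor position.
        ordered = [o for o in ordered if (o["created_at"], o["id"]) < last]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = (
        encode_cursor(page[-1]["created_at"], page[-1]["id"]) if has_more else None
    )
    return {"items": page, "next_cursor": next_cursor}
```

Because each page is defined by position in the sort order rather than by an offset, an order created between two calls lands ahead of the first page and does not push rows into or out of the pages that follow.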

Tradeoff Table

Choice	Gain	Cost
offset/page	simple UX	skips/duplicates under writes
cursor	stable traversal	opaque token and sort discipline
snapshot token	strongest consistency	server-side state or timestamp semantics
search-after	good for large datasets	cannot jump to arbitrary page

Failure Mode

Clients build reconciliation logic because the API cannot provide stable traversal.

Required Artifact

Write a pagination contract with sort keys, cursor format, mutation behavior, and invalid-cursor response.

Project / Capstone Connection

Capstone list APIs should avoid offset pagination for high-change collections.

Case Study 3: GraphQL Solves Overfetching And Reintroduces N+1

Scenario: A GraphQL endpoint lets clients request exactly the fields they need. A query asks for 50 issues, each issue's author, labels, and latest comment. Resolvers independently query the database, producing N+1 behavior.

Source anchor: GraphQL DataLoader describes batching and caching per request and notes that GraphQL field resolvers can easily create inefficient loading without a batching mechanism. See the DataLoader repository.

Module concepts:

GraphQL
resolver boundary
N+1
batching
contract vs implementation

Wrong Approach

"GraphQL is more efficient because clients select fields."

Field selection reduces overfetching, not necessarily backend work.

Better Approach

Design resolver loading:

Request:
  issues -> { author, labels, latestComment }

Loader plan:
  batch authors by user_id
  batch labels by issue_id
  batch latest comments by issue_id
  cache within request
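
The author part of the loader plan, together with a query count, can be sketched like this. BatchLoader is a simplified synchronous stand-in written for this example, not the DataLoader library's API, and FakeDb with its users_by_ids method is a hypothetical data source that counts queries.

```python
class FakeDb:
    """Hypothetical data source that counts how many queries it receives."""

    def __init__(self, users):
        self.users = users
        self.query_count = 0

    def users_by_ids(self, ids):
        self.query_count += 1  # one round trip per call, however many ids
        return {user_id: self.users[user_id] for user_id in ids}


class BatchLoader:
    """Collects keys, fetches them in one batch, caches for one request."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._cache = {}

    def load_many(self, keys):
        # De-duplicate while preserving order, then skip cached keys.
        missing = [k for k in dict.fromkeys(keys) if k not in self._cache]
        if missing:
            self._cache.update(self._batch_fn(missing))
        return [self._cache[k] for k in keys]


def resolve_authors_naive(db, issues):
    # One query per issue: the N in N+1.
    return [db.users_by_ids([i["author_id"]])[i["author_id"]] for i in issues]


def resolve_authors_batched(db, issues):
    loader = BatchLoader(db.users_by_ids)  # create one loader per request
    return loader.load_many([i["author_id"] for i in issues])
```

For 50 issues the naive resolver issues 50 user queries and the batched one issues a single query. Comparing db.query_count before and after a request is the kind of query-count test the required artifact asks for.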

Tradeoff Table

Choice	Gain	Cost
REST fixed shape	predictable backend query	possible overfetching
GraphQL naive resolvers	client flexibility	N+1 and cost unpredictability
DataLoader batching	efficient field resolution	per-request cache design
persisted queries/cost limits	operational control	governance overhead

Failure Mode

The API contract looks elegant while the database receives hundreds of hidden queries.

Required Artifact

Write a resolver query plan and a query-count test for one nested GraphQL request.

Project / Capstone Connection

If a capstone uses GraphQL, include resolver batching and cost controls.

Case Study 4: Breaking Change Hidden As Cleanup

Scenario: An API response includes customer_name. The server team renames it to display_name because the domain language improved. Mobile clients crash after deployment.

Source anchor: Google API Improvement Proposals document resource-oriented API design and compatibility practices for evolving APIs. See Google AIP-180 backward compatibility.

Module concepts:

backward compatibility
additive change
field removal
deprecation
consumer contract

Wrong Approach

"The new name is better, so replace the old field."

Better implementation language does not erase existing consumer contracts.

Better Approach

Evolve additively:

{
  "customer_name": "Amina Khan",
  "display_name": "Amina Khan"
}

Deprecate with evidence:

announce
measure usage
provide migration guide
wait through support window
remove only after consumers are gone
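
A small sketch of the additive response and a removal gate. The function names and the consumers_still_reading count are hypothetical; how that count is measured (client versions, opt-in headers, partner surveys) is left to the deprecation plan.

```python
from datetime import date


def serialize_customer(name: str) -> dict:
    # Additive change: the old field stays alongside the new one.
    return {"customer_name": name, "display_name": name}


def can_remove_old_field(
    today: date, support_window_end: date, consumers_still_reading: int
) -> bool:
    # Removal gate: the support window has passed AND telemetry shows
    # no remaining consumers of customer_name.
    return today >= support_window_end and consumers_still_reading == 0
```

Either condition alone is not enough: an elapsed window with active consumers still breaks them, and zero measured usage before the announced date breaks the promise made in the announcement.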

Tradeoff Table

Choice	Gain	Cost
rename in place	clean schema	breaks consumers
add new field	backward compatible	duplicate fields during transition
versioned endpoint	clean break	longer support burden
consumer-driven contract tests	catches breakage	needs consumer participation

Failure Mode

An internal refactor becomes a public incident.

Required Artifact

Write a deprecation plan with announcement, telemetry, migration examples, support window, and removal gate.

Project / Capstone Connection

Capstone APIs should include compatibility rules for response fields.

Case Study 5: REST, gRPC, Or Events For A Fulfillment Integration

Scenario: Checkout must tell fulfillment about paid orders. One team wants REST, one wants gRPC, and one wants events. Each is reasonable under different forces.

Source anchor: Google Cloud's API design guide and the gRPC documentation describe resource-oriented APIs and RPC contracts; AsyncAPI documents event-driven contracts for asynchronous systems. See Google API design guide, gRPC concepts, and AsyncAPI specification.

Module concepts:

REST
gRPC
event contract
synchronous vs asynchronous integration
consumer ownership

Wrong Approach

"Pick the integration style the team likes."

The style should match latency, ownership, reliability, and coupling.

Better Approach

Compare operation semantics:

Need immediate answer from fulfillment?
  use synchronous API

Need notify multiple subscribers after checkout?
  publish event after transaction commits

Need low-latency internal RPC with strong schema?
  gRPC may fit
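
The three questions can be written down as a small helper that lists candidate styles for an ADR. This is only an illustrative encoding of the questions above; the Workflow fields and the suggest_styles function are hypothetical names, and the output is a starting list to compare, not a decision.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    needs_immediate_answer: bool  # caller must learn accept/reject right away
    multiple_subscribers: bool    # several consumers react after checkout
    internal_low_latency: bool    # internal call needing a strong schema


def suggest_styles(workflow: Workflow) -> list:
    styles = []
    if workflow.needs_immediate_answer:
        styles.append("synchronous API")
    if workflow.multiple_subscribers:
        styles.append("event published after transaction commits")
    if workflow.internal_low_latency:
        styles.append("gRPC")
    return styles
```

A workflow can match more than one question, for example a synchronous acceptance call plus an event for downstream subscribers; the ADR then records why each style was kept or dropped.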

Tradeoff Table

Choice	Gain	Cost
REST	broad compatibility	weaker schema/streaming semantics
gRPC	strong schema and efficient RPC	client/tooling constraints
event	loose temporal coupling	eventual consistency and replay rules
webhook	external async callback	delivery and verification burden

Failure Mode

Checkout waits synchronously for work that could be asynchronous, or publishes events for a workflow that actually needs immediate acceptance/rejection.

Required Artifact

Write an integration-style ADR comparing REST, gRPC, and events for one capstone workflow.

Project / Capstone Connection

Capstone integration choices should be justified by consumer needs and failure modes, not by trend.

Source Map

Source	Use it for
Stripe idempotent requests	idempotency keys and safe retries
Microsoft REST API best practices	REST design, idempotency, collections
GraphQL DataLoader	batching and caching resolver loads
Google AIP-180	backward compatibility and breaking changes
Google API design guide	resource-oriented API design
gRPC concepts	RPC contracts and communication model
AsyncAPI specification	event-driven API contracts

Completion Standard

At least three artifacts are completed.
At least one API has an idempotency contract.
At least one list endpoint has a pagination contract.
At least one breaking-change plan includes telemetry and deprecation.
At least one integration-style ADR compares REST, gRPC, and events.
