Fitness Functions That Keep the Architecture Honest

What This Concept Is

A fitness function is an executable check that an architectural claim still holds. If your design doc says "the reports module does not depend on the scan-core module," a fitness function is a test that fails when someone adds that import. If your design doc says "p95 latency on the scan API stays under 500 ms at 20 rps," a fitness function is a load test that asserts it. If your ADR says "persistence goes through the Repository only," a fitness function is an import-graph check that nothing else touches the DB driver.

Fitness functions live in the gap between aspirational architecture ("I believe in loose coupling") and enforced architecture ("this PR cannot merge until the coupling rule is satisfied"). They turn an ADR from a document into an executable standard. At capstone scale, the useful ones are short, cheap, run on every commit, and are linked directly to a top-3 characteristic (concept 07) or an ADR (concept 12).

Common shapes:

Static code checks (ArchUnit, dependency-cruiser, import-linter, ESLint no-restricted-imports): "module A must not depend on module B."
Metric thresholds: "p95 < 500 ms at 20 rps"; "error rate < 0.1%."
Contract tests: "API schema matches the consumer's expectations."
Deployment / operability gates: "curl /health returns 200 post-deploy."
Correctness invariants: "100 concurrent reserveSeat calls on a 1-seat tier produce exactly 1 success."

One per top-3 characteristic is a reasonable capstone target.
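
The metric-threshold shape reduces to a pure function plus an assertion. The sketch below assumes the latency samples (in milliseconds) have already been collected by whatever load tool you use. The names percentile and checkLatencyBudget are hypothetical, and nearest-rank is only one of several reasonable percentile definitions.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of the
// samples are less than or equal to it. Assumes `samples` is non-empty.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical gate: returns a result object instead of throwing, so a CI
// script can print the measured value before exiting non-zero.
function checkLatencyBudget(samplesMs, budgetMs = 500) {
  const p95 = percentile(samplesMs, 95);
  return { p95, ok: p95 < budgetMs };
}
```

A CI step would call checkLatencyBudget on the recorded samples and fail the job when ok is false. Returning the measured p95 alongside the verdict makes a red build easier to read.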

Why It Matters Here (In the Capstone)

Architecture that is only in a document rots. You will be changing code under deadline pressure in weeks 3-5. Without fitness functions, you will violate your own architecture on day 17 and not notice until the week-6 defense, when the grader reads the design doc, opens the code, and spots the contradiction.

The second reason: fitness functions compress the defense (concept 15). "Operability is a driver -- here is the health-check gate passing in CI. Simplicity is a driver -- here is the module-boundary test preventing circular imports. Correctness is a driver -- here is the no-oversell concurrency test." Three sentences, three green checks, and the grader has evidence rather than prose.

The third reason: they protect the scope-cut rules (concept 14). When week-5 pressure hits, fitness functions are the wall around the quality gates. "I can cut features; I cannot cut the test that proves the cap holds."

Concrete Example(s) -- from a real capstone

Example A -- inventory service (drivers: correctness, operability, offline availability):

Correctness -- idempotent scan merge:

// tests/arch/scan-idempotency.test.js
// Assumption: getOnHand is exported from scan-core alongside applyScan.
import { applyScan, getOnHand } from "../../src/scan-core/index.js";
import { expect, test } from "vitest";

test("same scan applied twice yields exactly one on-hand increment", async () => {
  const key = "scan-abc-123";
  const event = { sku: "SKU-1", qtyDelta: 5, idempotencyKey: key };
  await applyScan(event); // first application changes on-hand
  const onHandBefore = await getOnHand("SKU-1");
  await applyScan(event); // replay with the same key must be a no-op
  const onHandAfter = await getOnHand("SKU-1");
  expect(onHandAfter).toBe(onHandBefore);
});
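
For orientation, here is one way the behavior under test could be implemented. This is an in-memory sketch, not the capstone's actual scan-core. createScanLedger is a hypothetical name, and a real store would persist the idempotency key in the same transaction as the quantity update.

```javascript
// In-memory stand-in for the scan store (illustrative only).
function createScanLedger() {
  const onHand = new Map();   // sku -> quantity
  const seenKeys = new Set(); // idempotency keys already applied

  return {
    applyScan({ sku, qtyDelta, idempotencyKey }) {
      // A replayed event is acknowledged but changes nothing.
      if (seenKeys.has(idempotencyKey)) return { applied: false };
      seenKeys.add(idempotencyKey);
      onHand.set(sku, (onHand.get(sku) ?? 0) + qtyDelta);
      return { applied: true };
    },
    getOnHand(sku) {
      return onHand.get(sku) ?? 0;
    },
  };
}
```

The fitness function does not care how the deduplication works, only that replaying an event leaves on-hand unchanged. That is what lets the implementation change without touching the test.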

Operability -- post-deploy health:

- name: Post-deploy health check
  run: curl --fail --max-time 5 https://staging.inventory.example/health

Modularity -- reports does not depend on scan-core internals:

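import { expect, test } from "vitest";
// listImports is assumed to be a project helper (not shown) that returns the
// resolved, repo-relative path of every import found under a directory.

test("reports module imports only public API of scan-core", () => {
  const imports = listImports("src/reports/");
  const violations = imports.filter(i => i.startsWith("src/scan-core/internal/"));
  expect(violations).toEqual([]);
});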
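
The boundary test above relies on a helper that knows what each file imports. A minimal sketch of that idea follows. It takes file contents as a plain object so it stays free of file-system code. All three function names are hypothetical, and the regex approach is deliberately crude. Tools such as dependency-cruiser or ESLint's no-restricted-imports do this properly.

```javascript
// Extract module specifiers from `import ... from "x"`, bare `import "x"`,
// and `require("x")`. Regex-based: it ignores dynamic import() and can be
// fooled by import-like text inside comments or strings.
function extractImports(source) {
  const pattern =
    /(?:\bimport\s+(?:[^'"]*?\s+from\s+)?|\brequire\(\s*)['"]([^'"]+)['"]/g;
  return [...source.matchAll(pattern)].map((m) => m[1]);
}

// Resolve a relative specifier against the importing file's directory using
// POSIX-style paths; bare specifiers (packages) are returned unchanged.
function resolveSpecifier(fromFile, specifier) {
  if (!specifier.startsWith(".")) return specifier;
  const parts = fromFile.split("/").slice(0, -1);
  for (const segment of specifier.split("/")) {
    if (segment === "..") parts.pop();
    else if (segment !== ".") parts.push(segment);
  }
  return parts.join("/");
}

// `files` maps repo-relative path -> source text. Returns every import made
// from inside `fromDir` that resolves to a path inside `forbiddenDir`.
function findBoundaryViolations(files, fromDir, forbiddenDir) {
  const violations = [];
  for (const [file, source] of Object.entries(files)) {
    if (!file.startsWith(fromDir)) continue;
    for (const specifier of extractImports(source)) {
      const target = resolveSpecifier(file, specifier);
      if (target.startsWith(forbiddenDir)) violations.push({ file, target });
    }
  }
  return violations;
}
```

In a real repo the files object would be built by reading src/reports/ from disk. Reporting both the offending file and the resolved target keeps the failure message actionable.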

Example B -- ticketing platform:

Correctness -- no oversell under concurrency (the capstone's soul):

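import { expect, test } from "vitest";
// createTier and reserveSeat are assumed to come from the ticketing service's
// own modules; import them from wherever the reservation code lives.

test("100 concurrent reserveSeat calls on a 1-seat tier produce exactly 1 success", async () => {
  const tier = await createTier({ cap: 1 });
  // Start all 100 calls before awaiting any, so they genuinely overlap.
  const attempts = Array.from({ length: 100 }, () => reserveSeat(tier.id));
  const results = await Promise.allSettled(attempts);
  // A rejected promise and an { ok: false } result both count as "no seat".
  const successes = results.filter(r => r.status === "fulfilled" && r.value.ok);
  expect(successes.length).toBe(1);
});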
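
To see why this test earns its place, it helps to watch a plausible implementation fail it. The sketch below is an in-memory illustration under assumed names (createNaiveTier, createGuardedTier, countSuccesses), with a timer standing in for a database round trip. It is not the capstone's reservation code.

```javascript
// Simulated I/O latency, standing in for a database round trip.
const tick = () => new Promise((resolve) => setTimeout(resolve, 1));

// Naive version: reads the count, awaits, then writes. Every concurrent
// caller sees the same "1 seat left" before any of them decrements it.
function createNaiveTier(cap) {
  let remaining = cap;
  return async function reserveSeat() {
    const seen = remaining; // read
    await tick();           // other callers interleave here
    if (seen <= 0) return { ok: false };
    remaining = seen - 1;   // write based on a stale read
    return { ok: true };
  };
}

// Guarded version: the check and the decrement happen in one synchronous
// step, the in-memory analogue of a single conditional UPDATE statement.
function createGuardedTier(cap) {
  let remaining = cap;
  return async function reserveSeat() {
    await tick();
    if (remaining <= 0) return { ok: false };
    remaining -= 1;
    return { ok: true };
  };
}

// Same counting logic as the fitness function above.
async function countSuccesses(reserveSeat, callers) {
  const attempts = Array.from({ length: callers }, () => reserveSeat());
  const results = await Promise.allSettled(attempts);
  return results.filter((r) => r.status === "fulfilled" && r.value.ok).length;
}
```

With one seat and 100 callers, the naive tier reports 100 successes and the guarded tier reports 1. A sequential test would pass both, which is why the fitness function has to be concurrent.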

Time-to-event-day -- end-to-end smoke in CI:

- name: E2E smoke
  run: npm run test:e2e -- --timeout 60s

Availability-during-window -- synthetic scan every 30s during the window:

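A script, started by cron or a scheduled GitHub Action, loops through the event window, runs a scan simulator every 30 seconds, and pages if it fails twice in a row. The loop matters: cron fires at most once a minute and scheduled GitHub Actions at most once every 5 minutes, so neither delivers a 30-second cadence on its own. At capstone scale this is a 10-line script, not a Datadog stack.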
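
The "fails twice in a row" rule is the only logic in that script worth testing, so it is sketched here as a small state machine with the network and paging parts injected. The names createFailureAlarm and startSyntheticScan, and the probe and page callbacks, are all hypothetical.

```javascript
// Tracks consecutive probe failures. Returns true exactly once per outage:
// on the failure that reaches the threshold, not on every failure after it.
function createFailureAlarm(threshold = 2) {
  let consecutive = 0;
  return function record(ok) {
    if (ok) {
      consecutive = 0;
      return false;
    }
    consecutive += 1;
    return consecutive === threshold;
  };
}

// Hypothetical wiring: `probe` resolves to true/false, `page` sends the alert
// and is expected to handle its own errors. Both are injected so the loop can
// be exercised without a network.
function startSyntheticScan({ probe, page, intervalMs = 30000, threshold = 2 }) {
  const record = createFailureAlarm(threshold);
  const timer = setInterval(async () => {
    // A thrown or rejected probe counts as a failed scan.
    const ok = await Promise.resolve().then(probe).catch(() => false);
    if (record(ok)) await page("synthetic scan failed twice in a row");
  }, intervalMs);
  return () => clearInterval(timer); // call this when the window closes
}
```

Paging once per outage rather than on every failed probe is a design choice. It keeps a long outage from producing an alert every 30 seconds.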

Example C -- finance aggregator:

Idempotency -- re-import the same CSV yields zero new rows:

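test("reimporting a file produces 0 new rows and 0 diffs", async () => {
  // Assumes the store is empty before the first import.
  const first = await importCsv("fixtures/march.csv");
  expect(first.newRows).toBeGreaterThan(0); // guards against a vacuous pass
  const second = await importCsv("fixtures/march.csv");
  expect(second.newRows).toBe(0);
  expect(second.changedRows).toBe(0);
});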

Privacy -- no outbound network in the import path:

A test that starts the import with a sandboxed process (no network egress allowed) and asserts it still completes. This is the "privacy" driver made executable.
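
A process-level sandbox depends on the host environment, so it is not sketched here. A cheaper in-process variant is shown instead, assuming the import path would only reach the network through the global fetch. withFetchBlocked is a hypothetical helper. It does not catch code that uses node:http or raw sockets, so it complements the sandboxed run rather than replacing it.

```javascript
// Runs `fn` with globalThis.fetch replaced by a stub that records the URL and
// throws. Only fetch() is guarded; treat this as a cheap first line of
// defense, not a sandbox.
async function withFetchBlocked(fn) {
  const originalFetch = globalThis.fetch;
  const attempts = [];
  globalThis.fetch = async (input) => {
    attempts.push(String(input && input.url ? input.url : input));
    throw new Error("network access is blocked in this test");
  };
  try {
    const result = await fn();
    return { result, attempts };
  } finally {
    // Always restore, even when `fn` throws.
    globalThis.fetch = originalFetch;
  }
}
```

A test would wrap the import call in withFetchBlocked and assert that attempts is empty. Recording attempts, instead of relying on the thrown error alone, also catches code that swallows the failure and carries on.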

Rules -- no uncategorized rows in the golden test fixture:

test("golden fixture classifies 100% of rows", async () => {
  await importCsv("fixtures/golden.csv");
  const uncategorized = await countUncategorized();
  expect(uncategorized).toBe(0);
});

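Each of the three capstones gets three fitness functions, one per driver where the driver can be checked cheaply. Example A's third check covers modularity rather than its offline-availability driver. Each is under 30 lines. Each runs in CI or, like the synthetic scan, on a schedule. This is what "architecture honesty" looks like at capstone scale.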

Common Confusion / Misconceptions

"Fitness functions are just tests." They are tests, but they test architectural claims, not units of behavior. A unit test for a pure function is not a fitness function. A test that fails when someone imports across a bounded-context boundary is.
"I will write them later." You will not. Fitness functions you promise to write in week 4 are fitness functions that do not exist. The whole point is to write the check before the violation is possible.
"One fitness function is enough." Usually not. Each top-3 characteristic gets one -- otherwise you have a driver you cannot defend with evidence.
"Fitness functions are a microservices thing." False. They work at every scale. A module-boundary assertion in a monolith is a fitness function. An idempotency assertion on a local import pipeline is a fitness function.
"If the fitness function passes, the architecture is fine." Fitness functions cover what you knew to check. Drift in unmonitored dimensions is still possible; that is what the weekly review (concept 13) is for.

How To Use It (In Your Capstone)

Start from the top-3 characteristics (concept 07). Each gets one fitness function.
For each, write one sentence: "this characteristic holds if I can automatically check that ___."
Translate into the smallest possible check. A unit test, an integration test, a CI step, or a load-test script. Prefer the smallest.
Commit the check in the same PR as the ADR that states the claim. No orphan fitness functions.
Wire it into CI. A characteristic without a CI gate is an aspiration. If a check is too slow to run on every commit (a load test, for example), run it on a weekly schedule instead.
Make it fail once, deliberately. Violate the claim on purpose, watch the pipeline go red, revert. This is the only way to verify the signal works.
If it later starts failing, treat it like any test failure -- fix the code, or, if the architecture has genuinely evolved, update the ADR and the fitness function together.

Check Yourself

What does a fitness function have that a regular unit test does not?
For one of your top-3 characteristics, state the fitness function in one sentence. What does a failure mean?
What happens in your process if a fitness function starts failing mid-build?
Which of your fitness functions is the most expensive to run? Is it on every PR or on a schedule?
Name one architectural claim in your design doc that does not yet have a fitness function. When will it have one?

Mini Drill or Application (Capstone-scoped)

Drill 1 (30 min). Pick one of your top-3 characteristics. Write the fitness function for it. Commit to the capstone repo. Wire it into CI. Watch it pass.
Drill 2 (10 min). Make it fail once (deliberately violate the claim) and verify the signal works. Revert.
Drill 3 (30 min x 2 more, weeks 2 and 3). Repeat for the other two characteristics. By end of week 3, all three drivers are gated.
Drill 4 (10 min, each Friday). Open CI and verify all three fitness functions passed all week. If any was skipped or silently disabled, investigate.
Drill 5 (defense prep, 5 min). For each fitness function, write a one-sentence "here is how I check this characteristic" line for the defense. Memorize all three.

Source Backbone

Capstone design applies earlier architecture and domain material. These books are the source backbone for the decisions in this module.

Fundamentals of Software Architecture - architecture characteristics, styles, and tradeoffs.
Learning Domain-Driven Design - domain discovery, subdomains, and bounded contexts.
Clean Architecture - dependency direction and boundary discipline.
API Design Patterns - contract and API decision support.
