Designing for Testability without Damaging the Domain
What This Concept Is
Testable code is code whose behavior you can exercise cheaply, in isolation, and at the level of granularity that matters. That is a desirable property. The failure mode is testability creep: contorting the production code specifically to make tests easier, until the code is worse as a domain model than it was before the tests.
Healthy testability uses three moves:
- Seams -- small, deliberate substitution points (dependency injection of the clock, filesystem, HTTP client).
- Pure cores / impure shells -- domain logic is a pure function over data; side effects sit in a thin shell that can be swapped in tests.
- Behavior, not implementation -- tests verify the domain's observable contract, so refactors do not require mass test rewrites.
Unhealthy testability uses:
- Test-only public methods (setInternalStateForTests)
- Leaking mocks into production (real code branching on whether a mock is present)
- Visibility-weakening for tests (making everything public; god-mode reflection in tests)
- Over-mocking -- every dependency mocked so the test pins down the implementation rather than the behavior
Why It Matters Here
"This is hard to test" is a legitimate design signal: hard-to-test code usually has hidden coupling or poor boundaries. But the response must not be "bolt on a backdoor"; it must be "fix the boundary." Otherwise you ship two things in one class, the domain logic and a test scaffold, and the domain loses.
Concrete Example
Bad (testability damage):
    import time

    class OrderService:
        def __init__(self):
            self._clock = time.time  # hidden dependency on the system clock
            self._db = real_db       # hard-wired global, defined elsewhere

        def confirm(self, order_id):
            now = self._clock()
            ...

        # added only to support tests
        def set_clock_for_tests(self, fake_clock):
            self._clock = fake_clock

        # added only to support tests
        def _DEBUG_peek_internal_state(self):
            return self._db._raw_rows
This class carries testing scaffolding into production. Anyone reading OrderService has to guess which methods are "real."
Better (seam + pure core):
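    from __future__ import annotations  # annotations may name classes defined later

    from dataclasses import dataclass, replace
    from datetime import datetime, timezone

    class Clock:
        def now(self) -> datetime: ...

    class SystemClock(Clock):
        def now(self) -> datetime:
            return datetime.now(tz=timezone.utc)

    class OrderService:
        # OrderRepository is assumed to be defined elsewhere
        def __init__(self, clock: Clock, repo: OrderRepository):
            self._clock = clock
            self._repo = repo

        def confirm(self, order: Order) -> Order:
            # thin shell: read the injected clock, delegate to the pure domain method
            return order.confirmed_at(self._clock.now())

    @dataclass(frozen=True)
    class Order:
        # field names assumed for illustration; the timestamp field cannot share
        # the method's name, or the two would collide
        status: str = "pending"
        confirmation_time: datetime | None = None

        def confirmed_at(self, now: datetime) -> Order:
            # pure function over data: returns a new Order, mutates nothing
            return replace(self, status="confirmed", confirmation_time=now)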
Tests for Order.confirmed_at need zero mocks -- they pass a datetime. Tests for OrderService.confirm use a trivial fake clock. No test-only public surface on production classes.
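To make that claim concrete, here is a minimal sketch of what such tests might look like. It redefines small versions of Order and OrderService so it runs on its own. The fields status and confirmation_time, the FixedClock test double, and the NOON constant are assumptions made for illustration, not part of the original design.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Order:
    # Field names are assumptions; the notes do not list Order's fields.
    status: str = "pending"
    confirmation_time: Optional[datetime] = None

    def confirmed_at(self, now: datetime) -> "Order":
        # Pure function over data: returns a new Order.
        return replace(self, status="confirmed", confirmation_time=now)


class FixedClock:
    """Hypothetical test double: always returns the same instant."""

    def __init__(self, instant: datetime):
        self._instant = instant

    def now(self) -> datetime:
        return self._instant


class OrderService:
    def __init__(self, clock, repo=None):
        self._clock = clock
        self._repo = repo

    def confirm(self, order: Order) -> Order:
        return order.confirmed_at(self._clock.now())


NOON = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)


def test_order_confirmed_at_needs_no_mocks():
    confirmed = Order().confirmed_at(NOON)
    assert confirmed.status == "confirmed"
    assert confirmed.confirmation_time == NOON


def test_service_confirm_uses_injected_clock():
    service = OrderService(clock=FixedClock(NOON))
    assert service.confirm(Order()).confirmation_time == NOON


def test_confirming_does_not_mutate_the_original():
    original = Order()
    original.confirmed_at(NOON)
    assert original.status == "pending"
```

The fake clock is a plain class that lives in the test code, so the production classes gain no extra methods for it.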
Common Confusion / Misconception
"Mock everything to isolate." Mocking every collaborator produces tests that pass when the code is broken and fail when the code is refactored. Mock at boundaries you do not own (network, filesystem, time). Prefer real objects for in-memory domain code.
"Make it public so the test can reach it." Almost always the wrong move. Test the behavior through the public API; if it is awkward, the API is probably missing a method that production code will also want.
"Untestable code must be wrong." Not always. Some code is simply a thin composition of others (adapter layers). A handful of integration tests at the top is better than forcing every adapter to be unit-tested through hoop-jumping.
"100% coverage is the goal." Coverage is a weak signal. A high-coverage test suite can still miss every important property of the domain; a leaner, behavior-focused suite can catch more bugs.
"Tests are separate from design." They are not. Hard-to-test code is a design smell. The response is to redesign the boundary, not to weaken production encapsulation.
How To Use It
When something is hard to test, diagnose before hacking:
- Is there a hidden dependency (time, random, filesystem, network, env var)? Inject it.
- Is the logic entangled with I/O? Split into a pure function plus a thin shell.
- Is the unit too big? Extract a smaller class with a clearer responsibility.
- Does the test want to peek at internal state? The behavior you want to assert is probably missing from the public API; add it for production, not for tests.
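The first two diagnoses can be sketched together. In this minimal example the late-fee rule, the InMemoryLedger fake, and all names are hypothetical. The calculation is a pure function, and a thin shell receives the clock and the storage as injected dependencies.

```python
from datetime import date


def late_fee_cents(due: date, returned: date, daily_fee_cents: int = 50) -> int:
    """Pure core: no clock, no storage, just data in and data out."""
    days_late = max(0, (returned - due).days)
    return days_late * daily_fee_cents


class InMemoryLedger:
    """Hypothetical fake for the storage boundary."""

    def __init__(self):
        self.charges = []

    def record_charge(self, member_id: str, amount_cents: int) -> None:
        self.charges.append((member_id, amount_cents))


def charge_late_fee(ledger, today, member_id: str, due: date) -> int:
    """Thin shell: read the injected clock, call the core, write the result."""
    fee = late_fee_cents(due, today())
    if fee > 0:
        ledger.record_charge(member_id, fee)
    return fee


def test_core_needs_no_doubles():
    assert late_fee_cents(date(2024, 3, 1), date(2024, 3, 4)) == 150
    assert late_fee_cents(date(2024, 3, 1), date(2024, 2, 28)) == 0


def test_shell_with_fake_clock_and_ledger():
    ledger = InMemoryLedger()
    fee = charge_late_fee(ledger, lambda: date(2024, 3, 3), "m-1", date(2024, 3, 1))
    assert fee == 100
    assert ledger.charges == [("m-1", 100)]
```

In production the shell would presumably be called with date.today and a real ledger; the core never needs to know the difference.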
Two small rules:
- No method exists solely for tests.
- Mocks only for things you do not own.
Check Yourself
- Why is a set_clock_for_tests method a design smell?
- Give an example of an injectable seam that is good design independently of tests.
- When is over-mocking worse than no mocking?
Mini Drill or Application
For each code smell, diagnose the testability issue and propose a refactor that does not damage production:
- A class that reads the current time by calling System.currentTimeMillis() directly.
- A function that both computes a discount and writes to the database.
- A service where every method is public because "the tests needed them."
- A test file that mocks twelve collaborators to call one method.
Read This Only If Stuck
- Good Code Bad Code: Make code testable and test it properly
- Good Code Bad Code: Design with dependency injection in mind
- Good Code Bad Code: Avoid making things visible just for testing
- Good Code Bad Code: Mocks and stubs can be problematic
- Clean Code: Unit tests -- keeping tests clean
- Clean Code: F.I.R.S.T.