Secrets and Config Without Sprawl
What This Concept Is
Secrets and configuration are the values your code needs at runtime that are not in the source tree: database URLs, API keys, third-party webhook secrets, feature-flag tokens, signing keys. "Without sprawl" means they live in one system per environment, and your app reads them through one typed entry point.
The minimum system for a capstone:
- one secret store per environment (cloud secret manager, HashiCorp Vault, Doppler, 1Password, or Kubernetes External Secrets) -- never committed .env files, never values in Terraform variables
- one config module in your app that reads from environment variables at startup
- zero real values checked into git, including in .env.example (placeholders only)
- one typed schema (Zod, Pydantic, Go struct tags, envparse) validating at boot; missing or malformed values cause the process to crash immediately, not serve 500s later
Two related axes: secret vs config (secret = rotation event if leaked; config = merely inconvenient) and per-env vs global (per-env = different value in each env; global = same everywhere). A JWT signing key is secret and per-env. A log level is config and per-env. A product-name string is config and global.
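The two axes can be written down as a small inventory. The following is a minimal sketch, not a prescribed format: the class names, the `INVENTORY` list, and the `PRODUCT_NAME` variable name are assumptions made for illustration, using the three examples from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    SECRET = "secret"  # a leak forces a rotation
    CONFIG = "config"  # a leak is merely inconvenient


class Scope(Enum):
    PER_ENV = "per-env"  # different value in each environment
    GLOBAL = "global"    # same value everywhere


@dataclass(frozen=True)
class EnvVar:
    name: str
    kind: Kind
    scope: Scope


# Hypothetical inventory; PRODUCT_NAME is an assumed variable name.
INVENTORY = [
    EnvVar("JWT_SIGNING_KEY", Kind.SECRET, Scope.PER_ENV),
    EnvVar("LOG_LEVEL", Kind.CONFIG, Scope.PER_ENV),
    EnvVar("PRODUCT_NAME", Kind.CONFIG, Scope.GLOBAL),
]


def rotate_on_leak(inventory):
    """Names that would need rotating if the repo or environment leaked."""
    return [v.name for v in inventory if v.kind is Kind.SECRET]
```

Keeping the inventory as data makes the question "what must be rotated after a leak?" a one-line query rather than a search through the codebase.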
Why It Matters Here (In the Capstone)
Secret sprawl is where capstones silently lose grades. Half the .env values in git, half in the cloud, one in a dotfile on your laptop -- and deployment becomes "works on my machine" forever. Reviewers look at this directly: they try to deploy on a fresh laptop and see what fails first.
There is also a security dimension the reviewer grades on sight: if the repo contains any real-looking token, it is an automatic red flag. Public repos are scraped continuously by bots, so any key checked in should be treated as compromised within minutes. Treat every committed secret as a revocation event, not an embarrassment.
Concrete Example(s)
A typed config entry point in the app (TypeScript with Zod):
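import { z } from "zod";

const Config = z.object({
  DATABASE_URL: z.string().url(),
  JWT_SIGNING_KEY: z.string().min(32),
  SENTRY_DSN: z.string().url().optional(),
  // Not z.coerce.boolean(): it calls Boolean(value), so the string "false" becomes true.
  FEATURE_NEW_BILLING: z.enum(["true", "false"]).default("false").transform((v) => v === "true"),
  LOG_LEVEL: z.enum(["debug", "info", "warn", "error"]).default("info"),
});

// Throws at module load if anything is missing or malformed, so the process fails at boot.
export const config = Config.parse(process.env);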
The Python/Pydantic equivalent for a FastAPI capstone:
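from pydantic import Field
from pydantic_settings import BaseSettings


class Config(BaseSettings):
    # Field names match env vars case-insensitively (DATABASE_URL -> database_url).
    database_url: str
    jwt_signing_key: str = Field(min_length=32)
    sentry_dsn: str | None = None
    feature_new_billing: bool = False


config = Config()  # raises ValidationError if a required value is missing or invalid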
A Terraform fragment that creates exactly one secret container and mounts it by reference, so no secret value enters Terraform state:
resource "google_secret_manager_secret" "db_url" {
  secret_id = "DATABASE_URL"

  replication {
    auto {}
  }
}

resource "google_cloud_run_v2_service" "api" {
  # ...
  template {
    containers {
      env {
        name = "DATABASE_URL"
        value_source {
          secret_key_ref {
            secret  = google_secret_manager_secret.db_url.secret_id
            version = "latest"
          }
        }
      }
    }
  }
}
The actual secret value is written out of band:
# One-time: write the value into the store (never checked into git)
printf '%s' "postgres://user:REAL_PASSWORD@/db?host=/cloudsql/..." | \
  gcloud secrets versions add DATABASE_URL --data-file=-
No secret values appear in Terraform. They are written to the store out-of-band, mounted by reference, and rotated there.
A pre-commit gitleaks hook that blocks a commit containing a secret (run the same scan in CI so a committed secret also fails the PR):
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
Common Confusion / Misconceptions
- "I'll commit .env.example with real keys to save time." .env.example must contain only variable names with placeholder values, never real values -- not even "the staging key, because staging is low-value." Any key in the repo is leaked the moment the repo is cloned or forked.
- "My CI can read secrets from an encrypted variable." True, but the secret store is still the single source of truth. CI pulls from the store by role (OIDC); it does not hold the secret itself. Storing a long-lived secret in CI variables creates a parallel copy you will forget to rotate.
- "Terraform variables are safe because they aren't committed." Any variable value passed into a resource is persisted into state, which Terraform writes to your backend. State is readable by anyone with backend read access, and anything marked sensitive still lives there in plaintext. Use secret-manager references, not variables, for real secrets.
- "Rotation is a yearly chore." Rotation is a one-minute operation in any modern store; the hard part is making rotation not break the app. Design for zero-downtime rotation (overlap window, cached values with TTL) from day one.
How To Use It (In Your Capstone)
- Pick one secret store per env and document it in SECURITY.md. Stop picking new stores.
- Add a typed config schema at the app boot path. Fail boot loudly if anything required is missing.
- Write one ADR listing every secret, its owner, its rotation cadence, and its blast radius (library/raw/decisions/004-secrets.md).
- Add a README section: "What to put in the secret store before first deploy," with exact CLI commands.
- Add a pre-commit hook or CI scan (trufflehog, gitleaks) so a committed secret fails the PR.
- Never put secret values in Terraform variables; create only the secret container in IaC and write values with the gcloud/aws/az CLI out of band.
- Grant the app's runtime identity (service account, managed identity) read access to exactly the secrets it needs -- not the whole store.
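Alongside a scanner such as gitleaks, a narrow project-specific check can enforce the "placeholders only" rule for .env.example. The sketch below is an illustration under stated assumptions: the `PLACEHOLDER` pattern and the `suspicious_lines` helper are hypothetical, and what counts as a placeholder is a convention you would choose yourself.

```python
import re

# Assumption: a value is a placeholder if it is empty, wrapped in angle
# brackets, or one of a few conventional filler words.
PLACEHOLDER = re.compile(
    r"^(|<[^>]*>|changeme|example|placeholder|xxx+)$", re.IGNORECASE
)


def suspicious_lines(text):
    """Return (line_number, name) pairs whose value does not look like a placeholder."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks, comments, and lines that are not NAME=value pairs.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name, _, value = stripped.partition("=")
        value = value.strip().strip("'\"")
        if not PLACEHOLDER.match(value):
            findings.append((number, name.strip()))
    return findings
```

The check is deliberately strict: it also flags harmless config defaults such as a log level, so in practice you would likely allowlist the names you classify as config. It complements a real secret scanner and does not replace one.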
Rotation Without Downtime
Every secret needs a rotation path that does not require a deploy:
- the store supports multiple active versions (GCP Secret Manager "versions", AWS Secrets Manager "staging labels")
- the app either reloads config on SIGHUP or uses a short-TTL cache (60-300s)
- both old and new values are valid during the rotation window; new writes use the new value, old in-flight requests complete against the old
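The short-TTL cache from the list above could look roughly like this. It is a sketch under assumptions: `CachedSecret` is a hypothetical class, and `fetch` stands in for whatever call your secret store's client library provides.

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Caches one secret value and refetches it once ttl_seconds have passed."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch      # hypothetical: wraps your secret store client
        self._ttl = ttl_seconds  # within the 60-300s range suggested above
        self._clock = clock      # injectable so tests do not need to sleep
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Cache is empty or stale: go back to the store.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

With this shape, a value rotated in the store reaches every process within one TTL without a deploy. This sketch lets a fetch error propagate; whether to serve the stale value when the store is unreachable is a separate design decision.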
Test rotation on staging before you need it on prod. A rotation that breaks the app is a rotation you will avoid -- which is how stale keys linger for quarters.
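One thing worth exercising in a staging rotation test is the overlap window itself: new signatures use the new key, while signatures issued just before the rotation still verify. The sketch below illustrates the idea with HMAC from the standard library; `sign` and `verify` are hypothetical helpers, not the API of any JWT library.

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes) -> str:
    """Sign with a single key; during rotation, callers pass the NEW key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str, accepted_keys: list[bytes]) -> bool:
    """Accept a signature made with any key still inside the overlap window."""
    return any(
        hmac.compare_digest(sign(message, key), signature)
        for key in accepted_keys
    )
```

During the window you would call `verify` with both the new and the old key; closing the window means dropping the old key from the list, after which older signatures stop verifying.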
See also (integrative)
- S9 M05 Cluster 2: Secret management, vaults, dynamic secrets, rotation -- full treatment; capstone is the minimal instance
- S9 M05 Cluster 2: Encryption at rest/in transit, KMS envelope -- how the secret store protects values at rest
- S9 M05 Cluster 1: Identity-centric security (least privilege) -- scoping the runtime identity to its secrets
- S9 M02 Cluster 1: State -- ground truth and its hazards -- why state can contain secrets and must itself be protected
- S9 M04 Cluster 5: Pipeline security -- secrets, OIDC, least privilege -- the CI side of the same problem
- AWS Secrets Manager: What is AWS Secrets Manager? -- managed secret store with rotation
- Google Secret Manager -- secret versions, IAM, mounting into Cloud Run
- HashiCorp Vault: Concepts -- dynamic secrets and lease-based rotation
- gitleaks -- pre-commit scanner for secret leaks
Check Yourself
- Where does a new developer look first to find what env vars the app needs?
- If you leaked the repo tomorrow, which of its contents would force a key rotation?
- How do you rotate the database password without a redeploy?
- What is the blast radius of the JWT signing key -- which service(s) and which user population?
- Which secrets does your runtime identity not have read access to, and why?
- What is the failure mode when the secret store is unreachable at boot?
Mini Drill or Application (Capstone-scoped)
- Env var inventory (25 min). List every env var the app currently reads. For each: "secret or config?", "which store owns it?", "per-env or global?". Migrate any that live in more than one place into exactly one.
- Typed config PR. Add a schema-validated config module (Zod/Pydantic/Go struct) to the boot path. Deliberately delete one required env var and verify the process refuses to start with a legible error.
- Rotation drill. Rotate one secret in staging without a deploy. Time the rollout. Write one paragraph in library/raw/decisions/004-secrets.md describing the exact commands and the observed downtime (ideally zero).
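For the typed config drill, it helps to have a boot check you can call with a deliberately broken environment. This is a standard-library sketch only: `config_errors` and `main` are hypothetical names, and a schema library such as Pydantic would normally do the validation itself.

```python
import os
import sys

# Variable names taken from the config examples earlier in this section.
REQUIRED = ("DATABASE_URL", "JWT_SIGNING_KEY")


def config_errors(env) -> list[str]:
    """Collect every problem at once so a single failed boot lists them all."""
    errors = [f"{name}: missing" for name in REQUIRED if not env.get(name)]
    key = env.get("JWT_SIGNING_KEY", "")
    if key and len(key) < 32:
        errors.append("JWT_SIGNING_KEY: must be at least 32 characters")
    return errors


def main(env=os.environ) -> int:
    errors = config_errors(env)
    if errors:
        # Report variable names only, never the values.
        print("refusing to start, invalid config:", *errors,
              sep="\n  ", file=sys.stderr)
        return 1
    return 0  # a real app would start serving here
```

Calling `sys.exit(main())` from the entry point turns the return value into the process exit code, so the drill can assert a non-zero exit after one required variable is removed.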
Source Backbone
Capstone deployment applies cloud, delivery, and operations material. These books are the source backbone for the delivery decisions.
- Building Secure and Reliable Systems - secure/reliable deployment posture.
- GitHub Actions in Action - workflow automation support.
- Pro Git - release history, tags, and branch discipline.
- The Linux Command Line - shell and deployment automation support.