Secret Management: Vaults, Dynamic Secrets, Rotation

What This Concept Is

A secret is any value whose disclosure compromises a trust boundary: database passwords, API tokens, private keys, signing keys, OAuth client secrets. Secret management is the lifecycle around those values:

  1. Generation -- created by a system, not a human, with enough entropy
  2. Storage -- held in a purpose-built store, not env files or source control
  3. Distribution -- handed to workloads just-in-time through a verified identity
  4. Rotation -- regularly replaced before compromise, not after
  5. Revocation -- immediately invalidated when a system suspects compromise
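
Step 1 is the easiest to make concrete. A minimal sketch, using Python's standard `secrets` module (which draws from the operating system's CSPRNG); the function name and the 128-bit floor are assumptions made for this illustration, not a requirement stated by any particular vault:

```python
import secrets

def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes random bytes."""
    if num_bytes < 16:
        # Assumed floor for this sketch: refuse anything under 128 bits.
        raise ValueError("use at least 16 bytes (128 bits) of entropy")
    return secrets.token_urlsafe(num_bytes)

# No human picks, types, or remembers this value.
token = generate_secret()
```

The point is less the function than who calls it: the value is minted by a system and goes straight into the store, never through a chat window.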

A secrets vault (HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, Azure Key Vault) is the system that coordinates these steps. The HashiCorp Vault docs frame its job around managing static secrets, certificates, identities and authentication, and third-party secrets, with features like encrypting sensitive data and integrating with certificate authorities.

A dynamic secret is a credential that the vault creates on demand, scoped narrowly, and revokes automatically. A typical example: an application asks for a database credential, and the vault generates a fresh DB user valid for 30 minutes, with only the privileges this app needs.
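
The three properties (created on demand, scoped narrowly, revoked automatically) can be modelled in a few lines. This is a toy in-memory sketch, not how any real vault is implemented; `Lease`, `issue_lease`, and the user-name format are hypothetical names chosen for the illustration:

```python
import secrets
import time
from dataclasses import dataclass

@dataclass
class Lease:
    username: str
    password: str
    role: str            # the only privileges this credential carries
    expires_at: float    # epoch seconds
    revoked: bool = False

    def is_valid(self, now: float) -> bool:
        return not self.revoked and now < self.expires_at

def issue_lease(role: str, ttl_seconds: int, now: float) -> Lease:
    # A fresh user name and password per request: nothing is shared or reused.
    return Lease(
        username=f"v-{role}-{secrets.token_hex(4)}",
        password=secrets.token_urlsafe(32),
        role=role,
        expires_at=now + ttl_seconds,
    )

now = time.time()
lease = issue_lease("app_reader", ttl_seconds=30 * 60, now=now)
print(lease.is_valid(now))             # True: inside the 30-minute window
print(lease.is_valid(now + 31 * 60))   # False: the TTL has passed
```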

Why It Matters Here

Secrets leak. They leak through source control, build logs, screenshots, stolen laptops, compromised CI runners, and over-permissive IAM. Every long-lived secret is a liability stored indefinitely.

The vault/dynamic-secret model changes the question from "how do we prevent every leak" to "how do we keep the blast radius small". A credential that is 15 minutes old, scoped to one database role, and audit-logged is a very different liability from a 2-year-old .env file full of production keys.

Concrete Example

An application service needs to read rows from a Postgres database.

Static-secret approach (wrong):

  • A DBA creates appuser, writes the password into Slack
  • An engineer copies it into config/prod.env
  • It is checked into a private repo (or, if it is not, the build system still holds a copy)
  • It lives unchanged for two years and is touched by every incident responder, every CI run, every intern

Dynamic-secret approach (right):

  1. The app starts up and authenticates to the vault using workload identity (Kubernetes service-account JWT, AWS IAM + IRSA, GCP workload identity, or mTLS cert)
  2. The vault verifies the identity and the policy attached to it
  3. The app asks for a Postgres credential from a database secrets engine
  4. The vault creates a new DB user v-app-<random>-<ts> with a strong random password, grants the role app_reader, and sets a 30-minute lease
  5. The app uses that credential for the session
  6. When the lease expires, or the app revokes it on shutdown, the vault drops the DB user

A Vault policy for step 2 looks like:

path "database/creds/orders-reader" {
  capabilities = ["read"]
}

path "secret/data/orders/*" {
  capabilities = ["read"]
}
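
To see why this is deny-by-default, here is a simplified model of how such a policy is evaluated. It handles only exact paths and a trailing `*` (which Vault treats as a prefix match) and ignores the `+` segment wildcard, the `deny` capability, and Vault's rule-priority logic, so read it as an illustration rather than Vault's actual algorithm. `POLICY` and `is_allowed` are hypothetical names:

```python
POLICY = {
    "database/creds/orders-reader": {"read"},
    "secret/data/orders/*": {"read"},
}

def is_allowed(path: str, capability: str, policy=POLICY) -> bool:
    for pattern, capabilities in policy.items():
        if pattern.endswith("*"):
            matched = path.startswith(pattern[:-1])   # trailing * = prefix match
        else:
            matched = path == pattern
        if matched and capability in capabilities:
            return True
    return False   # no matching rule means no access

print(is_allowed("database/creds/orders-reader", "read"))   # True
print(is_allowed("secret/data/orders/api-key", "read"))     # True
print(is_allowed("secret/data/orders/api-key", "update"))   # False
print(is_allowed("database/creds/billing-admin", "read"))   # False
```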

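And the Postgres secrets-engine role, written here as a Terraform resource, pins the SQL that the vault runs in step 4 to mint users and later to revoke them:

resource "vault_database_secret_backend_role" "orders_reader" {
  backend = "database"
  name    = "orders-reader"
  db_name = "orders"

  # Run when a credential is requested; Vault fills in the {{...}} placeholders.
  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'",
    "GRANT app_reader TO \"{{name}}\"",
  ]

  # Run when the lease expires or is revoked.
  revocation_statements = ["DROP ROLE \"{{name}}\""]

  # The Terraform provider takes TTLs as a number of seconds.
  default_ttl = 1800 # 30 minutes
  max_ttl     = 7200 # 2 hours
}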

The operator never sees a password, and the pipeline never stores one. If the pod is compromised, the attacker gets a read-only credential to one database that expires within 30 minutes. If a wider compromise is suspected, an operator can revoke every outstanding lease under the database mount in a single admin call (a revoke by prefix).
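
From the application's side, steps 1 and 3 are two HTTP calls. The sketch below uses only the Python standard library and assumes Vault's Kubernetes auth method is mounted at its default path `auth/kubernetes` and the database engine at `database`, matching the policy above. The function names are hypothetical, and error handling, TLS configuration, and lease renewal are left out:

```python
import json
import urllib.request

# Default location of the projected service-account token inside a pod.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def _call(url, payload=None, token=None):
    """Send one JSON request to Vault and return the decoded response body."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, method="POST" if data else "GET")
    if token:
        req.add_header("X-Vault-Token", token)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)

def login(vault_addr, auth_role):
    """Step 1: trade the pod's service-account JWT for a Vault token."""
    with open(SA_TOKEN_PATH) as f:
        jwt = f.read().strip()
    body = _call(f"{vault_addr}/v1/auth/kubernetes/login",
                 {"role": auth_role, "jwt": jwt})
    return body["auth"]["client_token"]

def parse_creds(body):
    """Pick the fields the app needs out of a database/creds response."""
    return {
        "username": body["data"]["username"],
        "password": body["data"]["password"],
        "lease_id": body["lease_id"],
        "ttl_seconds": body["lease_duration"],
    }

def fetch_db_creds(vault_addr, vault_token, db_role="orders-reader"):
    """Step 3: ask the database secrets engine for a fresh credential."""
    url = f"{vault_addr}/v1/database/creds/{db_role}"
    return parse_creds(_call(url, token=vault_token))
```

A caller would run `login(...)` once at startup with its own Vault address and auth role (both placeholders here), pass the returned token to `fetch_db_creds(...)`, build the Postgres connection from the result, and fetch again before `ttl_seconds` runs out. Nothing is written to disk or to the environment.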

Common Confusion / Misconception

"We already use a secrets store." That is not the same as "we have secret management". A store that just holds strings is better than .env, but if those strings are long-lived, shared, and never rotated, you have moved the problem into a nicer UI.

"Environment variables are insecure storage." They are a distribution mechanism, not storage. They are fine at the moment of process start, as long as the source is a vault and the values are short-lived. The problem is writing them to files, logging them, passing them through shell history, or leaking them into crash dumps. Kubernetes projected service-account tokens and volume-mounted vault leases are safer than env vars because they can rotate in place.

"Rotation is changing the value." It is "change the value, invalidate everything that used the old one, and verify nothing broke". A rotation you cannot verify is not a rotation. Dynamic secrets make this trivial because the TTL does the invalidation for you; static secrets require a separate revocation path.

"The vault itself is the trust anchor." The vault's auth method is the trust anchor. A vault protected by a shared root token is just another static secret. Use cloud IAM, Kubernetes auth, or mTLS as the auth method so the vault's own access is identity-bound too.

"Secret sprawl is an engineering problem." It is usually an organizational one. If secrets can be created without going through the vault (e.g. "I'll just paste it into CI"), they will be. A vault without a policy that forbids out-of-band secrets is a vault that catches the well-behaved half of the organization.

How To Use It

For every secret in your system ask:

  1. Where is this stored? (If the answer is "in the repo" or "in the CI UI", fix that first.)
  2. What identity is allowed to read it? (If the answer is "anyone in the account", fix.)
  3. How long is it valid? (If the answer is "forever", make it shorter.)
  4. What happens if we rotate it tomorrow? (If nobody knows, you have discovered something.)
  5. Is there an audit log of who fetched it? (If not, that is an R finding from Concept 1.)
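
If you keep even a rough inventory, the five questions can be run mechanically. A sketch, with a hypothetical record shape and made-up example entries; the string labels ("repo", "ci-ui", "anyone-in-account") are conventions invented for this illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SecretRecord:
    name: str
    storage: str                  # e.g. "vault", "repo", "ci-ui"
    readers: str                  # e.g. "orders-service", "anyone-in-account"
    ttl_hours: Optional[float]    # None means it never expires
    rotation_tested: bool
    audit_logged: bool

def findings(s: SecretRecord) -> list:
    out = []
    if s.storage in {"repo", "ci-ui"}:
        out.append("stored in repo or CI UI")
    if s.readers == "anyone-in-account":
        out.append("readable by anyone in the account")
    if s.ttl_hours is None:
        out.append("valid forever")
    if not s.rotation_tested:
        out.append("rotation impact unknown")
    if not s.audit_logged:
        out.append("no audit log of reads")
    return out

records = [
    SecretRecord("orders-db-creds", "vault", "orders-service", 0.5, True, True),
    SecretRecord("vendor-api-key", "ci-ui", "ci-admins", None, False, True),
]
for r in records:
    print(r.name, findings(r))
# orders-db-creds []
# vendor-api-key ['stored in repo or CI UI', 'valid forever', 'rotation impact unknown']
```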

Check Yourself

  1. Why is a vault better than a shared password manager for service-to-service credentials?
  2. What does "dynamic" buy you that a static secret with rotation does not?
  3. How would you handle a secret that must be static (e.g. a third-party API key that cannot be rotated without vendor coordination)?

Mini Drill or Application

For a system you know, list every secret you can think of (start with database, API keys, signing keys, SSH keys, OAuth clients). For each one write:

  • current storage
  • current rotation cadence
  • what would break if you rotated it tomorrow at 2 p.m.

The secrets with answers like "unknown / unknown / unknown" are the ones to work on first.
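
Once the list exists, ordering it is a one-liner. A sketch with made-up entries, where `None` stands for "unknown":

```python
inventory = [
    {"name": "orders-db", "storage": "vault", "rotation": "30m lease", "breaks": "nothing"},
    {"name": "legacy-ssh-key", "storage": None, "rotation": None, "breaks": None},
    {"name": "vendor-api-key", "storage": "ci-ui", "rotation": None, "breaks": "nightly export"},
]

def unknown_count(entry) -> int:
    """Count how many of the three drill answers are still unknown."""
    return sum(entry[field] is None for field in ("storage", "rotation", "breaks"))

# Most unknowns first: these are the secrets to investigate before anything else.
ranked = sorted(inventory, key=unknown_count, reverse=True)
for entry in ranked:
    print(entry["name"], unknown_count(entry))
# legacy-ssh-key 3
# vendor-api-key 1
# orders-db 0
```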
