Modularity and State Workshop

Kata: Take Lab 1's single-file configuration and extract it into a reusable module with an explicit boundary, then back it with remote state and state locking. This exercises Cluster 3: module contracts, remote state, and team safety.

Retrieval Prompts

  1. What distinguishes a reusable module from a root module?
  2. Name three things that should not be inside a reusable module (and belong in the root / composition layer instead).
  3. What is a backend block, and what does terraform init -migrate-state do?
  4. What is "state locking," mechanically -- who holds the lock, where, and for how long?
  5. Why is using workspaces to separate environments discouraged in production-shaped orgs?

Compare and Distinguish

  • Module input contract vs provider configuration. A reusable module should not configure providers inline. Why?
  • terraform workspace new prod vs a separate envs/prod/ directory. Which is the norm and why?
  • Local state on your laptop vs S3 + DynamoDB remote state. What changes about the failure modes?
  • Polyrepo (repo per service) vs monorepo (all infra in one). What does each cost when the company grows from 5 to 50 engineers?

The Workshop

  1. Extract the module. Move your Lab 1 code into modules/storage-bucket/. Remove any provider block from the module -- providers must be configured in the root only.
  2. Compose two environments. Create envs/dev/ and envs/prod/ directories. Each calls the module with different name and environment inputs.
  3. Add remote state. Create an S3 bucket (or the GCS / Azure equivalent) and a DynamoDB lock table by hand, once. Then wire a backend "s3" block into each environment, giving each its own state key and setting dynamodb_table for locking.
  4. Migrate. Run terraform init -migrate-state to move local state into the backend. Verify by running terraform state list against the backend.
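  5. Prove locking. In two terminals, run terraform plan against envs/prod/ simultaneously. Capture the lock error one of them receives. If plan finishes too quickly for the runs to collide, leave terraform apply waiting at its approval prompt in the first terminal; it holds the lock until you answer.
  6. Break state deliberately. In a scratch environment only, remove one resource from state with terraform state rm (the real resource is left untouched). Run plan. Observe what Terraform now believes (and what reality holds). Recover with terraform import.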
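
Steps 2 and 3 can be sketched as a single root configuration for one environment. This is a minimal sketch, not the workshop's reference solution: it assumes the AWS variant, and the state bucket name, region, bucket name input, and provider version constraint are hypothetical. Only the name and environment inputs, the modules/storage-bucket/ path, and the acme-tf-locks table name come from this page.

```hcl
# envs/prod/main.tf (sketch; names marked "assumed" are hypothetical)

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint
    }
  }

  # Backend blocks cannot reference variables, so every value is a literal.
  backend "s3" {
    bucket         = "acme-tf-state"               # assumed: the bucket created by hand in step 3
    key            = "envs/prod/terraform.tfstate" # must be unique per environment
    region         = "us-east-1"                   # assumed
    dynamodb_table = "acme-tf-locks"               # table needs a string partition key named LockID
    encrypt        = true
  }
}

# Providers are configured here in the root, never inside the module.
provider "aws" {
  region = "us-east-1" # assumed
}

module "storage_bucket" {
  source = "../../modules/storage-bucket"

  name        = "acme-prod-assets" # assumed bucket name
  environment = "prod"
}
```

envs/dev/main.tf would differ only in the backend key and the two module inputs. Recent Terraform releases also support S3-native locking through the backend's use_lockfile argument; the sketch keeps dynamodb_table because that is what step 3 asks for.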

Mini Application

Write a modules/storage-bucket/README.md that includes:

  • a one-sentence purpose statement
  • a Usage block with a module call example using a versioned source
  • an Inputs table (variable name, type, default, required, description)
  • an Outputs table
  • a Design notes paragraph answering: what is this module responsible for, and what is it explicitly not responsible for?
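
For the Usage block, a module call pinned to a version might look like the sketch below. The repository URL and the v1.2.0 tag are hypothetical placeholders; substitute wherever the module is actually published.

```hcl
# Usage sketch for the README. The git URL and tag are hypothetical.
module "storage_bucket" {
  # "//" separates the repository from the subdirectory; "?ref=" pins a tag.
  source = "git::https://example.com/acme/terraform-modules.git//modules/storage-bucket?ref=v1.2.0"

  name        = "acme-dev-assets" # assumed bucket name
  environment = "dev"
}
```

The version argument applies only to registry sources, so a git source pins its version through ref. A local path such as ../../modules/storage-bucket cannot be versioned at all, which is one reason to show a remote source in the README.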

Common Mistake Check

Identify and fix:

  1. A reusable module contains provider "aws" { region = "us-east-1" }.
  2. envs/prod/main.tf and envs/dev/main.tf each have a different set of inputs to the module -- but one of them silently has a different resource schema.
  3. Two engineers committed terraform.tfstate to git "so both could use it."
  4. The DynamoDB lock table is acme-tf-locks, but the backend config for envs/prod/ was copy-pasted from envs/dev/ and still points at acme-tf-locks-dev. Why is this dangerous?
  5. terraform workspace new prod is used as the prod environment. What operator surprise is likely in the first year?
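
One possible fix for mistake 1 is sketched below: the module declares which provider it needs and leaves configuration to the caller. This assumes the Lab 1 module manages an AWS S3 bucket; the resource name, variable descriptions, tag key, and version constraint are illustrative.

```hcl
# modules/storage-bucket/versions.tf (sketch)
# Declare the provider requirement, but do not configure the provider.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed minimum
    }
  }
}

# modules/storage-bucket/main.tf (sketch)
variable "name" {
  type        = string
  description = "Bucket name."
}

variable "environment" {
  type        = string
  description = "Environment label, for example dev or prod."
}

# No provider "aws" block here: the region comes from the root's provider.
resource "aws_s3_bucket" "this" {
  bucket = var.name

  tags = {
    Environment = var.environment # assumed tag key
  }
}
```

With the provider block removed, the same module can be called from roots that target different regions or accounts, and removing a module call no longer takes away the provider configuration Terraform needs to destroy that module's resources.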

Evidence Check

This page is complete only when you can:

  • show a modules/ + envs/dev/ + envs/prod/ layout that applies cleanly
  • show a screenshot of the lock error from the two-terminal test
  • explain, in 60 seconds, how S3+DynamoDB locking works mechanically
  • recover from a terraform state rm using terraform import
  • justify why your module boundary is where it is (not somewhere else)
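
The recovery item asks for the terraform import command. As a declarative alternative, Terraform 1.5 and later also accept an import block in the root module. The sketch below assumes the module call is named storage_bucket, the resource inside it is aws_s3_bucket.this, and the scratch bucket is named acme-scratch-assets; all three are hypothetical.

```hcl
# envs/scratch/imports.tf (sketch; addresses and ID are hypothetical)
# Equivalent CLI form:
#   terraform import module.storage_bucket.aws_s3_bucket.this acme-scratch-assets
import {
  to = module.storage_bucket.aws_s3_bucket.this
  id = "acme-scratch-assets" # for an S3 bucket, the import ID is the bucket name
}
```

Running terraform plan then shows the import as a planned action, and terraform apply writes the resource back into state. The block can be deleted once the apply succeeds.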