Writing a Reusable Terraform Module
What This Concept Is
A Terraform module is a directory with .tf files that can be called from another configuration. The root module is the directory you run terraform apply in. A child module is one called via a module "name" { source = "..." } block.
A reusable module is one that other people can use without reading your code to figure out what to pass in. That adds non-trivial requirements on top of "it works":
- a clear, documented input contract (variables)
- predictable outputs
- opinionated defaults so the 80% case is one line
- no assumptions about the caller's file layout or naming
- a version tag you can pin against
If any of those are missing, the "module" is a snowflake wrapped in a directory.
Why It Matters Here
The moment a second engineer copy-pastes your root module into another repo, you have a maintenance problem. Properly extracted modules solve three problems at once:
- Deduplication -- one definition of "a production VPC," not seventeen copies
- Governance -- security patterns (encryption, logging, tags) live in the module, not in every caller
- Review scope -- changes to the module review once, not per caller
This is the same reason we write functions in code: one abstraction, many call sites.
Concrete Example
A tiny but realistic VPC module structure:
modules/vpc/
main.tf # aws_vpc, subnets, route tables, IGW
variables.tf # name, cidr, azs, enable_nat
outputs.tf # vpc_id, public_subnet_ids, private_subnet_ids
locals.tf # name_prefix, base_tags
versions.tf # required_providers + required_version
README.md # contract, examples, tradeoffs
variables.tf excerpt:
variable "name" {
description = "Short identifier used in resource names (e.g., 'platform-prod')."
type = string
}
variable "cidr" {
description = "IPv4 CIDR block for the VPC."
type = string
default = "10.0.0.0/16"
}
variable "azs" {
description = "Availability zones to spread subnets across. Two minimum."
type = list(string)
validation {
condition = length(var.azs) >= 2
error_message = "At least two AZs are required for HA."
}
}
variable "enable_nat" {
description = "Whether to provision a NAT gateway for private subnets."
type = bool
default = true
}
And the caller:
module "vpc" {
source = "git::https://github.com/acme/tf-modules.git//vpc?ref=v1.4.2"
name = "platform-prod"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
# cidr and enable_nat take the module defaults
}
resource "aws_instance" "web" {
subnet_id = module.vpc.private_subnet_ids[0]
# ...
}
Two things to notice:
- the caller passes two lines of config and gets a production-shaped VPC
- the
?ref=v1.4.2pins a git tag, so an upstream refactor cannot break this stack on a random Tuesday
Anatomy of a Good Module Contract
| Area | Good | Bad |
|---|---|---|
| Inputs | 3-10, named after the user's problem | 30, named after the resource's attributes |
| Defaults | the 80% case works with zero config beyond name | every input required |
| Outputs | IDs, ARNs, DNS names downstream actually need | everything, "in case" |
| Versioning | semver git tags or registry releases | main branch, no tags |
| README | usage example + one-paragraph tradeoffs | absent or auto-generated |
| Nesting | 1 level if you must, never 3 | deep trees nobody can trace |
Common Confusion / Misconception
"A module should wrap each resource type." No. Modules should encapsulate higher-level concepts (a VPC, a service, an ECS task + alarms + log group) not a single aws_instance. A 1:1 wrapper module adds indirection without abstraction. The official docs call this out explicitly.
"More inputs = more flexible." More inputs = more documentation, more breakage surface, more ways to call the module wrong. Opinionated modules with narrow inputs outlast flexible ones.
"I'll pin the version later." No you won't. Pin on first use. git::...?ref=main will bite you.
"Every module needs its own git repo." A monorepo of modules (modules/vpc, modules/ecs-service, modules/rds-postgres) is perfectly fine and often better -- shared linting, shared CI, one PR for cross-module changes.
How To Use It
- Extract a module only when there is a second caller. Premature modules are worse than inline code.
- Start with
name, 2-3 required inputs, and everything else defaulted. Grow the contract reluctantly. - Every module has
versions.tfwithrequired_providers-- this is non-negotiable for multi-caller modules. - Write the README before the module matures. If you cannot describe what it does in two paragraphs, the module is not cohesive.
- Tag releases. Consumers pin. Both you and them sleep better.
Check Yourself
- When should you NOT extract a module?
- Why should
sourcealways include a version pin in production callers? - What goes wrong if a module takes 25 required inputs?
Mini Drill or Application
Take the mini-module you built in Concept 05 and convert it into a callable child module:
- move it into
modules/<name>/ - add a
versions.tfwithrequired_providersandrequired_version - call it twice from a root module with different inputs (different
env, differentinstance_count) - verify both instances come up, tagged correctly
Then review your variable list. Can you delete or default any? Good modules shrink over time.
See also (external)
- Terraform Language: Creating modules -- structure, when to write a module, and when not to.
- Terraform Best Practices: Module structure -- a well-known opinionated take that aligns with most production practice.
Source Backbone
Infrastructure-as-code details are tool-specific, but these local books provide the operational backbone for shell, Git, and change discipline.
- Pro Git - versioned infrastructure changes, branching, review, and rollback habits.
- Git from the Bottom Up - mental model for stateful change history.
- The Linux Command Line - shell and automation grounding for infrastructure work.