Module Quiz

Complete this quiz after finishing all concept and practice pages. The quiz mixes conceptual interpretation, HCL reading, and incident-style scenarios.

Current Module Questions

Question 1: Declarative vs Imperative

Classify each and justify in one sentence.

  • (a) A Bash script that calls aws ec2 run-instances, then aws ec2 create-tags.
  • (b) A resource "aws_instance" "web" {} block in a .tf file.
  • (c) An Ansible playbook with a list of tasks that install, configure, and start nginx.

Answer:

  • (a) Imperative. The script specifies a sequence of actions; the operator owns reconciliation.
  • (b) Declarative. The block specifies desired state; Terraform diffs against real infrastructure and decides actions.
  • (c) Hybrid -- task-ordered but module-declarative. Playbooks are ordered, but most modules (apt, file, service) implement declarative idempotency locally.
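For contrast with script (a), here is a minimal sketch of how the same launch-and-tag outcome might look declaratively. The AMI ID, instance type, and tag value are placeholders assumed for illustration and do not come from the quiz.

```hcl
# Declarative counterpart to script (a): one block describes the instance
# and its tags, and Terraform works out which API calls to make.
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # hypothetical AMI ID
  instance_type = "t3.micro"              # hypothetical size

  tags = {
    Name = "web" # hypothetical tag
  }
}
```

If tagging fails halfway through the Bash script, the operator has to notice and retry; here a second terraform apply would simply reconcile the missing tag.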

Question 2: Reading a Plan

Terraform will perform the following actions:

  # aws_s3_bucket.artifacts will be updated in-place
  ~ resource "aws_s3_bucket" "artifacts" {
      ~ tags = {
          ~ "Env" = "dev" -> "development"
        }
        id = "acme-artifacts-dev"
    }

  # aws_db_instance.primary must be replaced
-/+ resource "aws_db_instance" "primary" {
      ~ engine_version = "14.10" -> "15.4" # forces replacement
        id = "db-abc123" -> (known after apply)
      - final_snapshot_identifier = null
    }

Plan: 1 to add, 1 to change, 1 to destroy.

What should a reviewer say? Name three specific concerns and what the safe path is.

Answer:

  • (1) aws_db_instance.primary is being destroyed and recreated; unless final_snapshot_identifier is set and skip_final_snapshot = false, the data is gone.
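  • (2) engine_version jumps a major version (14.10 to 15.4), and the plan replaces the instance instead of upgrading it in place. With the AWS provider, a major-version change is normally an in-place modification gated by allow_major_version_upgrade = true, so a forced replacement is itself worth investigating before anything is applied. If a new instance really is required, the safe move is to add it as a separate RDS resource (terraform apply -target can scope that first apply), migrate the data, then retire the old instance.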
  • (3) Bundling a trivial tag rename (dev -> development) with a destructive RDS change is a PR-hygiene failure. Split the PR.

Safe path: reject the PR; split into (A) tag rename (approvable in 30 seconds) and (B) RDS major upgrade (its own ticket with a migration plan, a moved or import strategy, and approval from the on-call DBA).
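As a rough illustration of the guard rails PR (B) could carry, the sketch below sets the snapshot arguments named in concern (1) and adds prevent_destroy so that a replacement plan fails instead of proceeding. The identifier, instance class, storage size, username, and snapshot name are all assumed values, not details from the quiz.

```hcl
resource "aws_db_instance" "primary" {
  identifier        = "acme-primary"  # hypothetical
  engine            = "postgres"
  engine_version    = "14.10"
  instance_class    = "db.t3.medium"  # hypothetical
  allocated_storage = 50              # hypothetical, in GiB

  username                    = "app_admin" # hypothetical
  manage_master_user_password = true        # keeps the password out of config

  # If the instance is ever destroyed, take a final snapshot first.
  skip_final_snapshot       = false
  final_snapshot_identifier = "acme-primary-final" # hypothetical

  lifecycle {
    # Any plan that would destroy or replace this instance now errors out.
    prevent_destroy = true
  }
}
```

With this in place, the -/+ shown in the plan above would surface as a plan-time error that a reviewer cannot skim past.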

Question 3: State Corruption Scenario

Two engineers on the same team each run terraform apply against envs/prod/ within 60 seconds of each other. The backend is S3 without DynamoDB locking. What can happen, and what is the failure mode?

Answer: Without locking, the two applies race. Each reads state A, each computes a plan against A, and each uploads a new state (B₁ and B₂) with no merge. Whichever upload lands last silently overwrites the other. Terraform's state then disagrees with reality: resources created by the overwritten apply exist in AWS but not in state (so the next apply tries to create them again and fails on "already exists"), and neither engineer sees the full set of changes that actually reached production. Recovery requires reconciling state with terraform state commands and import blocks, and -- if unlucky -- manual cloud-console forensics. The fix is dynamodb_table on the backend (or S3-native locking via use_lockfile = true on Terraform 1.10 and later, or HCP Terraform, which locks by default).
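A minimal sketch of a locked S3 backend for envs/prod/ follows. The bucket name, key, region, and table name are assumptions; the DynamoDB table is expected to have a string partition key named LockID.

```hcl
terraform {
  backend "s3" {
    bucket  = "acme-tfstate-prod"           # hypothetical state bucket
    key     = "envs/prod/terraform.tfstate" # hypothetical key
    region  = "us-east-1"                   # hypothetical region
    encrypt = true

    # Lock table: the second concurrent apply waits or fails instead of
    # computing a plan against stale state.
    dynamodb_table = "acme-tfstate-locks" # hypothetical table name
  }
}
```

With the lock in place, the second engineer's apply reports that the state is locked and names the holder, rather than racing.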

Question 4: HCL Interpretation

variable "env" {
  type = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.env)
    error_message = "env must be one of: dev, staging, prod."
  }
}

locals {
  bucket_name = "acme-${var.env}-artifacts"
  is_prod     = var.env == "prod"
}

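resource "aws_s3_bucket" "this" {
  bucket = local.bucket_name

  lifecycle {
    # Must be a literal: Terraform rejects variables and locals
    # (such as local.is_prod) inside lifecycle arguments.
    prevent_destroy = true
  }
}

What does this code guarantee? What does it not guarantee?

Answer: Guarantees:

  • var.env is constrained to a known set at plan time.
  • The bucket name contains the environment, preventing collisions across envs in the same account.
  • While the resource block stays in configuration, Terraform rejects any plan that would destroy the bucket (terraform destroy, or a change that forces replacement). Because lifecycle arguments accept only literal values, the protection applies in every environment, not only when env == "prod"; deleting the whole resource block removes the protection along with it.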

Does not guarantee:

  • That nobody deletes the bucket out of band (AWS console). prevent_destroy is Terraform-local.
  • That an engineer cannot remove prevent_destroy in a one-line PR. Policy-as-code is needed for that.
  • That the bucket is encrypted, versioned, or private. Those attributes are absent from this snippet.
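To make the last gap concrete, here is a sketch of the companion resources that would add versioning, default encryption, and a public-access block to aws_s3_bucket.this. The choice of aws:kms as the algorithm is an assumption; the quiz does not specify one.

```hcl
resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms" # assumption; "AES256" is the other common choice
    }
  }
}

resource "aws_s3_bucket_public_access_block" "this" {
  bucket = aws_s3_bucket.this.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```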

Question 5: moved Block Semantics

What is the difference between these two approaches to renaming aws_instance.app to aws_instance.api?

Approach A:

moved {
  from = aws_instance.app
  to   = aws_instance.api
}

Approach B:

terraform state mv aws_instance.app aws_instance.api

Answer: Both end in the same state. A is declarative and lives in source control; the next engineer sees the rename in the diff and understands why. B is imperative and leaves no trace in the repository; the next engineer sees "address changed somehow" with no rationale. Approach A is strongly preferred for team workflows. Approach B is sometimes needed for edge cases (e.g., on Terraform versions older than 1.1, which predate moved blocks), and should be followed immediately by a commit adding a moved block (which is then a no-op against state but documents the intent).

Question 6: HCL Interpretation

resource "aws_security_group_rule" "web" {
  for_each          = var.allowed_cidrs
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = [each.value]
  security_group_id = aws_security_group.web.id
}

What is the address of each rule, and what happens if you remove one entry from var.allowed_cidrs?

Answer: Each rule's address is aws_security_group_rule.web["<key>"], where the key is the map key in var.allowed_cidrs. Removing an entry removes the corresponding rule from state and from AWS on the next apply (plan shows a - action for that key). for_each with a keyed map is preferred over count, because removing an entry in the middle of a list shifts indices and causes Terraform to rebuild many unrelated rules.
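The question does not show how var.allowed_cidrs is declared. One plausible declaration, assuming a map of label to CIDR with made-up entries, would be:

```hcl
variable "allowed_cidrs" {
  description = "Label -> CIDR allowed to reach port 443."
  type        = map(string)

  default = {
    office = "203.0.113.0/24"  # hypothetical (documentation range)
    vpn    = "198.51.100.0/24" # hypothetical (documentation range)
  }
}

# Resulting addresses under this assumption:
#   aws_security_group_rule.web["office"]
#   aws_security_group_rule.web["vpn"]
# Deleting the "office" entry plans a destroy of web["office"] only;
# web["vpn"] keeps its address and is left untouched.
```

If the variable were a set(string) instead, the key would be the CIDR itself, since each.key and each.value are identical for sets.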

Question 7: Remote State and Outputs

Team A's Terraform provisions a VPC and emits vpc_id as an output. Team B's Terraform needs vpc_id to provision services. Name three ways to expose the value across codebases and give the main tradeoff of each.

Answer:

  • Data source (terraform_remote_state) -- reads Team A's state directly. Simple, but couples Team B's ability to plan to Team A's state layout and requires read access to Team A's entire state. Renaming or removing an output on Team A's side breaks Team B's plan.
  • Data lookup (aws_vpc with a tag filter) -- queries AWS for a VPC with a known tag. Decouples state files; couples to a tag convention.
  • Parameter store / SSM / Secrets Manager -- Team A writes the output to SSM; Team B reads it. Decouples fully; adds an operational dependency and a manual step if the ID changes.

No perfect answer; tag-based lookups plus SSM as an escape hatch is a common choice.
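From Team B's side, the three options might look like the sketch below. The bucket, key, region, tag value, and parameter name are all assumed for illustration.

```hcl
# Option 1: read Team A's state directly.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "acme-tfstate-prod"         # hypothetical
    key    = "network/terraform.tfstate" # hypothetical
    region = "us-east-1"                 # hypothetical
  }
}
# Use: data.terraform_remote_state.network.outputs.vpc_id

# Option 2: look the VPC up by an agreed tag.
data "aws_vpc" "shared" {
  tags = {
    Name = "acme-shared" # hypothetical tag convention
  }
}
# Use: data.aws_vpc.shared.id

# Option 3: read a value Team A published to SSM Parameter Store.
data "aws_ssm_parameter" "vpc_id" {
  name = "/acme/network/vpc_id" # hypothetical parameter name
}
# Use: data.aws_ssm_parameter.vpc_id.value (marked sensitive by the provider)
```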

Question 8: Policy as Code

You ship a Rego policy that forbids public S3 buckets. A team asks for an exception: their bucket hosts a static website and legitimately needs public-read. What is the correct response?

Answer: Do not bypass the policy for this one plan. Update the policy to allow an exception when a specific attribute or tag is present (e.g., tags.PublicIntentional = "true" with a justification field tags.PublicReason). Require a second reviewer to merge that tag. This keeps the policy honest: every public bucket is still gated, the exceptions are auditable, and future exceptions follow the same path. Ad hoc bypasses (// policy: ignore) are how policy engines lose authority.
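On the Terraform side, the team's request would then be an ordinary, reviewable diff. A sketch, with a made-up bucket name and reason text, and using the tag names suggested in the answer:

```hcl
resource "aws_s3_bucket" "site" {
  bucket = "acme-static-site" # hypothetical

  tags = {
    # Both tags are what the updated policy would look for before
    # allowing public access; the second reviewer approves this diff.
    PublicIntentional = "true"
    PublicReason      = "Static website served directly from S3"
  }
}
```

The Rego rule itself is not shown here; the point is only that the exception is visible in the plan and in version control.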

Question 9: Workspaces vs Environments

A new hire suggests terraform workspace new prod to manage production separately from dev. What is wrong with this suggestion?

Answer: CLI workspaces share everything except the state file: same backend bucket, same codepath, same provider config. A typo in var.env == "prod" can silently deploy dev-shaped resources into prod, and a misfire of terraform workspace select prod can point commands at production state. Environments deserve physical separation: separate directories (envs/dev/, envs/prod/), separate backend config (separate state bucket and lock table), and often separate accounts. HashiCorp's own workspace documentation cautions that CLI workspaces are a poor fit for deployments that need separate credentials and access controls; treat them as lightweight, branch-like copies for experimentation, not as environment boundaries.
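A sketch of what physical separation could look like inside envs/prod/ follows; envs/dev/ would carry its own copy with different values. Bucket, table, region, and account ID are assumptions.

```hcl
# envs/prod/main.tf (hypothetical file)
terraform {
  backend "s3" {
    bucket         = "acme-tfstate-prod"       # prod-only state bucket (hypothetical)
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "acme-tfstate-locks-prod" # prod-only lock table (hypothetical)
  }
}

provider "aws" {
  region = "us-east-1"

  # Refuse to run if the active credentials belong to any other account,
  # so dev credentials cannot touch prod by accident.
  allowed_account_ids = ["111111111111"] # hypothetical prod account ID
}
```

Because the backend and account are fixed per directory, pointing a command at production requires changing into envs/prod/ with prod credentials, not just selecting a workspace.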

Question 10: Drift Detection

What is drift, and why do terraform plan runs on an unchanged codebase sometimes show changes?

Answer: Drift is a difference between real infrastructure and Terraform's understanding of it. Plans show changes on unchanged code because someone or something changed the infrastructure out of band (console click, another tool, an autoscaler). The response depends on the diff: if it is a false positive (e.g., AWS adds a default attribute that Terraform does not manage), filter it via lifecycle { ignore_changes = [...] }; if it is a real out-of-band change, either adopt it (update the code) or revert it (let apply reconcile). Never silence drift with lifecycle { ignore_changes = all } -- that turns Terraform into a view, not a source of truth.
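The autoscaler case is a common place for a narrowly scoped ignore_changes. The sketch below assumes a launch template named aws_launch_template.web and a variable var.private_subnet_ids defined elsewhere; sizes are made up.

```hcl
resource "aws_autoscaling_group" "web" {
  name                = "web" # hypothetical
  min_size            = 2
  max_size            = 10
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids # assumed to exist

  launch_template {
    id      = aws_launch_template.web.id # assumed to exist
    version = "$Latest"
  }

  lifecycle {
    # Scaling policies move desired_capacity at runtime; ignore only that
    # attribute so every other out-of-band change still shows up in plan.
    ignore_changes = [desired_capacity]
  }
}
```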

Question 11: Import Block

resource "aws_s3_bucket" "legacy_logs" {
  bucket = "acme-legacy-logs"
}

import {
  to = aws_s3_bucket.legacy_logs
  id = "acme-legacy-logs"
}

What does terraform plan show, and how do you know the import is "clean"?

Answer: The plan shows an import line for aws_s3_bucket.legacy_logs. A clean import also shows no ~ actions -- just the import and nothing else. If the plan shows ~ changes to the bucket, it means the .tf configuration does not match the existing resource's attributes; Terraform will mutate reality on apply. The correct path is to iterate on the .tf until the plan is "import-only," then apply, then remove the import block.

Question 12: Blast Radius

Your root module manages VPC + 3 services + RDS + S3 + CloudFront. An engineer's refactor accidentally inverts a for_each. Running apply would destroy all three services. Name two structural patterns (not review processes) that reduce the blast radius of this class of mistake.

Answer:

  • Stack decomposition. Split network, data, and apps into separate root modules with separate states. A bug in the apps module cannot destroy the VPC because the VPC lives in a different state. (Changes propagate via tagged data lookups or SSM.)
  • prevent_destroy on high-value resources. The RDS and the CloudFront distribution should have lifecycle { prevent_destroy = true }. A destructive plan will fail at plan time, not during apply.

(Bonus: PR templates that require a pasted plan output, with automation that flags -/+ or - actions, so humans cannot skim past them.)
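As one possible shape for the decomposition, the network stack could publish its VPC ID to SSM so the apps stack never needs the network state. The directory name, the aws_vpc.main resource, and the parameter name are assumptions.

```hcl
# stacks/network/publish.tf (hypothetical file in the network root module)
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/acme/network/vpc_id" # hypothetical parameter name
  type  = "String"
  value = aws_vpc.main.id        # assumes aws_vpc.main is defined in this stack
}
```

The apps root module would read this parameter through a data source and hold only its own resources in state, so an inverted for_each there can plan destroys for the services but not for the VPC, RDS, or CloudFront.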

Interleaved Review Questions

Prior Module Question 1 (S9 M1: Cloud Foundations)

What is the difference between IAM roles and IAM users, and why does IaC prefer roles for service-to-service access?

Answer: Users are long-lived human identities with static credentials. Roles are short-lived, assumed at call time, produce temporary credentials (STS), and bind authorization to the calling principal (another service, an EC2 instance, a Lambda). IaC prefers roles for service-to-service access because role assumption produces no static secrets to rotate, leak, or commit to state. Terraform's AWS provider itself is typically configured to assume a role, not to carry a user's access key.
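A provider configured to assume a role might look like the following sketch. The account ID, role name, session name, and region are placeholders.

```hcl
provider "aws" {
  region = "us-east-1" # hypothetical

  # Terraform calls STS with whatever base credentials the runner has
  # (for example a CI identity) and works with the temporary credentials.
  assume_role {
    role_arn     = "arn:aws:iam::111111111111:role/terraform-deployer" # hypothetical
    session_name = "terraform-ci"                                      # hypothetical
  }
}
```

No access key appears in configuration or state; revoking access means editing the role's trust policy rather than rotating a secret.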

Prior Module Question 2 (S7 M5: ADRs and Reviews)

You write an ADR that says "we will use Terraform for all cloud infrastructure." Later the team wants to use CDK for one service. What changes about the ADR?

Answer: The ADR stays as the decision of record and is either superseded by a new ADR ("We now allow CDK where X conditions hold") or amended with a dated addendum. Do not silently let the exception happen; the invariant "all infra in Terraform" was load-bearing (single tool, single state model, one skill set on call). A new ADR names what changed and why, and spells out the conditions for choosing CDK over Terraform so the next team does not relitigate it.

Prior Module Question 3 (S1 M2: Unix and CLI)

What is idempotency in a shell-scripting context, and how does it relate to terraform apply?

Answer: In shell scripting, an idempotent script can run repeatedly without changing the system after the first successful run (e.g., mkdir -p foo). terraform apply is idempotent at the whole-run level: applying the same configuration to the same state twice produces the same result; the second run's plan is "no changes." The guarantee rests on the resource lifecycle model (create / update / delete) and holds only as far as provider authors implement idempotent CRUD against the target API.

Prior Module Question 4 (S6 M-foundations: Git)

Why does terraform apply in CI need to check out the exact commit being reviewed, not main?

Answer: The plan a reviewer reads is tied to a specific commit. If main has moved (another PR merged since this one was opened), a post-merge CI apply would apply the combined effect of both PRs -- which neither reviewer saw or approved. CI must either apply the reviewed commit directly, or require rebase + re-plan + re-review before apply. This is the Terraform parallel to "don't merge without running CI against a fresh rebase."

Prior Module Question 5 (S5 Testing)

Why is a unit test for a Terraform module weaker than an integration test that actually plans against a provider?

Answer: Unit tests (HCL parsing, variable validation) catch syntax and schema errors. They cannot catch provider-level errors: resource attributes that the provider rejects, implicit dependencies, or plan diffs caused by provider default values. An integration test that runs terraform plan against a real provider (or an emulator such as LocalStack) catches those. The analog in application code is "your linter caught the typo; your integration test caught the bug."
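Terraform's built-in test framework (terraform test, available from 1.6) can express both flavors. The sketch below assumes the module under test is the Question 4 snippet and that the file lives at tests/env.tftest.hcl; a run with command = plan still needs provider credentials or a mocked provider.

```hcl
# tests/env.tftest.hcl (hypothetical path)

# Plan-level check: the provider is loaded and a real plan is computed.
run "prod_bucket_name" {
  command = plan

  variables {
    env = "prod"
  }

  assert {
    condition     = aws_s3_bucket.this.bucket == "acme-prod-artifacts"
    error_message = "Bucket name should embed the environment."
  }
}

# Validation-level check: an unknown env must be rejected by the
# variable's validation block.
run "rejects_unknown_env" {
  command = plan

  variables {
    env = "qa"
  }

  expect_failures = [var.env]
}
```

The second run is closer to a unit test; the first is the kind of check that can surface provider-side surprises.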

Self-Assessment and Remediation

Mastery Level (90-100% correct):

  • Ready to own Terraform codebases at work. Focus next on policy-as-code in CI and on a real refactor of one monolith root module.

Proficient Level (75-89% correct):

  • Review the concept pages in the cluster where you missed. Most common gap is Cluster 4 (plan review discipline and refactoring).

Developing Level (60-74% correct):

  • Rework Practice 2 (modularity and state) and Practice 3 (refactoring clinic). The most likely symptom is missing state mechanics.

Insufficient Level (below 60% correct):

  • Restart from Cluster 1. The usual underlying gap is treating IaC as "writing YAML"; it is an engineering discipline about state, plans, and blast radius. Produce three ADR-style writeups of past infra incidents before attempting the rest.