Terraform at Scale: Module Patterns That Work

Most Terraform problems come from bad module design, not from Terraform itself. These are the patterns we use for multi-account, multi-environment customers to keep changes safe and easy to review.

When customers tell us "Terraform is unmaintainable," we almost never find a problem with Terraform. We find a problem with modules. The modules are too thin, too clever, too connected to each other, or try to do too much. The risk comes from the design, not the tool.

These are the rules we follow on every project. They are simple on purpose. Simple is what you want at three in the morning when something goes wrong during an apply.

Rule 1: Modules represent patterns, not resources

A module that wraps a single resource adds complexity without adding value. If you cannot give the module a name that is clearly different from the resource it wraps, do not write the module. Just use the resource directly. Modules are useful when they represent a pattern: a service (compute + IAM + logging + alarms), a network tier, or a data pipeline.
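As a rough sketch of what a "pattern" module can look like, the fragment below bundles a queue, its dead-letter queue, and an alarm into one unit. The module path (modules/work-queue), the variable names, and the thresholds are assumptions made for illustration; they are not taken from a real project.

```hcl
# modules/work-queue/main.tf (hypothetical module name and inputs)

variable "name" {
  type        = string
  description = "Base name for the queue and its related resources."
}

variable "alarm_topic_arn" {
  type        = string
  description = "SNS topic that receives the dead-letter alarm."
}

# Dead-letter queue that collects messages the consumer failed to process.
resource "aws_sqs_queue" "dlq" {
  name = "${var.name}-dlq"
}

# Main queue; messages move to the DLQ after five failed receives.
resource "aws_sqs_queue" "main" {
  name = var.name
  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.dlq.arn
    maxReceiveCount     = 5
  })
}

# Alarm as soon as anything lands in the DLQ.
resource "aws_cloudwatch_metric_alarm" "dlq_not_empty" {
  alarm_name          = "${var.name}-dlq-not-empty"
  namespace           = "AWS/SQS"
  metric_name         = "ApproximateNumberOfMessagesVisible"
  dimensions          = { QueueName = aws_sqs_queue.dlq.name }
  statistic           = "Maximum"
  period              = 300
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  alarm_actions       = [var.alarm_topic_arn]
}
```

The name "work-queue" says something the resource name aws_sqs_queue does not, which is the test from the rule above.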

Rule 2: Keep the tree flat

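Use one or two levels of nesting, never more. A root module calls reusable modules, and a reusable module may call one more layer of reusable modules. At three levels deep, every resource address in the plan and the state becomes a chain like module.platform.module.service.module.queue.aws_sqs_queue.main, and tracing a change back to the code that caused it gets hard.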

Rule 3: One state file per blast radius

Use separate state for each environment - always. Use separate state for each account - always. Keep Terraform focused on real infrastructure: networking, IAM roles, queues, databases, DNS, and other durable cloud resources. Product runtime configuration and deployments should move through GitOps where possible. A centralized S3 backend in a dedicated DevOps account is fine, as long as every account and environment has its own state path, lock, encryption, and IAM boundary. Ask yourself: "Can I destroy this state without breaking anything else?" If the answer is no, the boundaries are wrong.

  • Per-environment state: dev, staging, prod are each isolated.
  • Per-account state: a shared backend account is fine, but each AWS account needs a separate state path, lock, and access policy.
  • Platform vs. product flow: use Terraform for durable AWS infrastructure, and GitOps for application deployments, runtime config, and environment promotion where possible.
  • Pipeline-managed state only: never edit by hand, never run apply locally in shared environments.
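A backend block for one state in such a layout might look like the sketch below. The bucket name, account ID, region, and key layout are placeholders we made up. The use_lockfile argument assumes a recent Terraform release with S3-native locking (1.10 or later, to our knowledge); older versions typically lock through a DynamoDB table via dynamodb_table instead.

```hcl
# envs/prod/core/backend.tf (illustrative values only)
terraform {
  backend "s3" {
    bucket = "example-devops-tfstate" # hypothetical bucket in the DevOps account
    region = "eu-west-1"              # placeholder region

    # One key per account, environment, and blast radius.
    key = "accounts/111111111111/prod/core/terraform.tfstate"

    encrypt      = true # server-side encryption for the state object
    use_lockfile = true # S3-native state locking (assumes Terraform 1.10+)
  }
}
```

Backend blocks cannot reference variables, so each state directory carries its own literal key. The IAM boundary is then a policy on the pipeline role that only allows access to its own key prefix.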

Rule 4: Variables and outputs are the contract

Every variable needs a type and a description. Values that do not change between environments (like a buffer size) get a sensible default. Values that are specific to an environment (like account ID or VPC ID) get no default, which forces the caller to set them explicitly. Do not copy a module's output value into a tfvars file or a hardcoded variable to feed another module. Reference the output or resource attribute directly (for example, module.network.vpc_id) so Terraform can build the correct dependency graph.
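A minimal sketch of such a contract inside a reusable module, with one environment-specific input, one input with a default, and two outputs. The resource names and the 30-day retention value are assumptions chosen for the example.

```hcl
# Environment-specific: no default, so the caller must set it.
variable "vpc_id" {
  type        = string
  description = "VPC the service is deployed into."
}

# Same in every environment: a default keeps callers short.
variable "log_retention_days" {
  type        = number
  description = "CloudWatch log retention in days."
  default     = 30
}

resource "aws_security_group" "service" {
  name_prefix = "example-service-" # hypothetical name
  vpc_id      = var.vpc_id
}

resource "aws_cloudwatch_log_group" "service" {
  name              = "/example/service" # hypothetical name
  retention_in_days = var.log_retention_days
}

# Outputs are the other half of the contract: expose only what callers need.
output "security_group_id" {
  description = "ID of the service security group."
  value       = aws_security_group.service.id
}

output "log_group_arn" {
  description = "ARN of the service log group."
  value       = aws_cloudwatch_log_group.service.arn
}
```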

Rule 5: Use stable keys, not indexes

One common anti-pattern is using count with a list and then inserting a new item in the middle of that list. Terraform identifies those resources by position, so every item after the change can look like a different resource. That can destroy and recreate resources that did not need to change. Use for_each with a set of strings or a map with stable keys instead. For objects, convert the list to a map keyed by a stable name or ID, so Terraform tracks identity by key, not by array position.
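The conversion from a list of objects to a keyed map can be sketched as follows. The variable name and the object fields are hypothetical.

```hcl
variable "queues" {
  type = list(object({
    name              = string
    retention_seconds = number
  }))
  description = "Queues to create. Each name must be unique."
}

locals {
  # Re-key the list by name so identity no longer depends on list position.
  queues_by_name = { for q in var.queues : q.name => q }
}

resource "aws_sqs_queue" "this" {
  for_each = local.queues_by_name

  name                      = each.key
  message_retention_seconds = each.value.retention_seconds
}
```

With this shape the resource addresses look like aws_sqs_queue.this["orders"] rather than aws_sqs_queue.this[2], so inserting a new entry in the middle of the list only adds one resource.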

Rule 6: Pin versions, automate the rest

Pin the Terraform version, the AWS provider version, and the module versions. Run terraform fmt, terraform validate, and tflint on every PR in CI. Run terraform plan and post the output as a PR comment. Apply only from the main branch through a pipeline with an approval step. Never let a developer run apply against shared state from their laptop.

  • terraform fmt + terraform validate as a pre-commit hook, and again in CI.
  • tflint with the AWS ruleset in CI.
  • Plan on every PR, posted as a PR comment.
  • Apply only from main, only through a pipeline, only with approval.
  • Drift detection every night - with an alert when it finds something.
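The pinning part of this rule might look like the fragment below. All version numbers are illustrative, and the Git URL, module path, and tag are placeholders rather than a real repository.

```hcl
terraform {
  # Pin the CLI to one minor release line.
  required_version = "~> 1.9.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # stay within one major version
    }
  }
}

module "network" {
  # Hypothetical internal module repo, pinned to a tag instead of a branch.
  source = "git::https://example.com/platform/terraform-modules.git//network?ref=v1.4.0"

  # Module inputs omitted in this sketch.
}
```

Committing the .terraform.lock.hcl file alongside this keeps provider versions and checksums identical between the PR plan and the pipeline apply.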

Rule 7: Tags are not optional

Use the AWS provider's default_tags in the root module. This makes every taggable resource created through that provider carry Environment, Project, Owner, CostCenter, and ManagedBy=Terraform tags. This is what makes cost tracking and incident response possible later. Resources without tags become invisible in cost reports and ownership lookups. Invisible resources do not get optimized or removed.
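A provider block along these lines sets the tags once for the whole root module. The variable names are assumptions for the sketch.

```hcl
variable "region" {
  type        = string
  description = "AWS region for this root module."
}

variable "tagging" {
  type = object({
    environment = string
    project     = string
    owner       = string
    cost_center = string
  })
  description = "Values for the mandatory tags."
}

provider "aws" {
  region = var.region

  # Applied to every taggable resource this provider creates.
  default_tags {
    tags = {
      Environment = var.tagging.environment
      Project     = var.tagging.project
      Owner       = var.tagging.owner
      CostCenter  = var.tagging.cost_center
      ManagedBy   = "Terraform"
    }
  }
}
```

Tags set on an individual resource are merged with these defaults, and a resource-level tag with the same key takes precedence.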

Rule 8: Promote with code, not with copies

Environments share modules, but each environment should have clear state boundaries. Use tfvars or locals - either is fine. The important part is the state layout: each environment should be split by real blast radius, such as core infrastructure, databases, and product infrastructure.

  • modules/ - reusable patterns (compute, network, data).
  • envs/dev/core, envs/staging/core, envs/prod/core - network, shared IAM, logging, and other core infrastructure.
  • envs/dev/dbs, envs/staging/dbs, envs/prod/dbs - databases and durable data services.
  • envs/dev/product-infra, envs/staging/product-infra, envs/prod/product-infra - queues, IAM roles, DNS, and other product infrastructure that still belongs in Terraform.
  • Same modules, different inputs through tfvars or locals. Promotion is a Git workflow, and application deployments stay in GitOps where possible.
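One way to read the layout above: each environment directory holds the same root module code and differs only in its inputs. The sketch below assumes a modules/network module that accepts environment and vpc_cidr inputs; both input names and the CIDR value are hypothetical.

```hcl
# envs/prod/core/main.tf
# envs/dev/core/main.tf and envs/staging/core/main.tf carry the same code.

variable "environment" {
  type        = string
  description = "Environment name, set per environment in tfvars."
}

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the environment's VPC."
}

module "network" {
  source = "../../../modules/network" # shared module, hypothetical inputs

  environment = var.environment
  vpc_cidr    = var.vpc_cidr
}

# envs/prod/core/terraform.tfvars would then contain only the inputs:
#   environment = "prod"
#   vpc_cidr    = "10.20.0.0/16"
```

Promoting a change means merging the same module version bump into the next environment's directory, not copying code between directories.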

