deltakroneker

Reputation: 771

How to organize Terraform modules for multiple environments?

Every Terraform guide on the web provides a partial solution that almost never reflects the real picture. I get that not everyone has the same infrastructure needs, but what worries me is that the common scenario, including:

  1. Multiple environments (e.g. dev, stage)
  2. Remote backend (e.g. AWS S3)
  3. Some basic resources (e.g. S3 buckets or EC2 instances)

isn't presented anywhere in a real example project. I'm looking for just that.

In the meantime, I have researched and concluded that apart from those needs, I also want:

  1. To utilize modules.
  2. To NOT use workspaces, but rather a distinct directory-per-environment approach.
  3. To NOT use a terragrunt wrapper.

This is my current structure, which does not utilize modules - only a root module:

infra/ --------------- 'terraform init', 'terraform apply' inside here*  
     main.tf --------- Sets up AWS provider, backend, backend bucket, DynamoDB table   
     terraform.tfvars   
     variables.tf ---- Holds few variables such as aws_region, project_name...

I think my desired structure folder tree (for a simple dev & staging simulation of a single bucket resource) would look something like this:

infra/  
     dev/  
        s3/  
            modules.tf ------ References s3 module from local/remote folder with dev inputs   
     stage/  
        s3/  
            modules.tf ------ References s3 module from local/remote folder with stage inputs   

But what about the files from my previous root module? I still want to have a remote backend in the same way as before - just that now I want to have two state files (dev.tfstate and stage.tfstate) in the same backend bucket? What would the backend.tf files look like in each sub-directory, and where would they be? In s3/ folder or dev/ folder?

It's kind of confusing since I'm transitioning from a root module terraform init approach, to a specific sub-directory terraform init. It's not clear to me whether I should still have a root-module or another prerequisite folder (for example: global/), which I should init at the beginning of the project, and which is basically left alone from that point on (since it created the buckets which dev/ and staging/ reference).

One more question: if I have s3/, ec2/, and ecr/ sub-directories inside each environment, where do I execute the terraform plan command? Does it traverse all sub-directories?

When I have the answers and a clear picture of this above, it would be great to improve it by DRYing it up, but for now, I value a more practical solution through example rather than just a theoretic DRY explanation.

Upvotes: 43

Views: 45854

Answers (5)

Daniel Hornik

Reputation: 2521

I've worked with Terraform for 5 years, and I've made a lot of mistakes with modules and environments over my career. What follows reflects my own knowledge and experience. It may be bad.

Real example projects may be hard to find because Terraform is not used to create open source projects. It's often unsafe to share Terraform files, because you are sharing all potential vulnerabilities in your infrastructure.

Module purpose and size

You should create modules that have a single purpose, but your modules should be generic.

For example, you can create a bastion host module, but a better idea is to create a module for a generic server. This module may have some logic dedicated to your business problems, like CloudWatch log groups, some generic security group rules, etc.
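
As a rough sketch of that idea (not taken from the original answer), a bastion host could be one instance of a generic server module. The module path, every input name, and all values below are hypothetical and only illustrate the shape of such a call:

```hcl
# Hypothetical call site: a bastion host expressed as an instance of a
# generic "server" module. No name here comes from the original answer.
variable "public_subnet_id" {
  type        = string
  description = "Subnet in which the bastion is placed (assumed input)."
}

module "bastion" {
  source = "../../modules/computing/server" # assumed location of the generic module

  name          = "bastion"
  instance_type = "t3.micro"
  subnet_id     = var.public_subnet_id

  # Generic security group rules are supplied by the caller.
  ingress_rules = [
    { port = 22, cidr = "203.0.113.0/24" }, # documentation-only address range
  ]

  # Business-specific extras (for example a CloudWatch log group) live inside
  # the module; the caller only switches them on or off.
  create_log_group = true
}
```

The same module could then back an application server or a worker by changing only the inputs.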

Application modules

Sometimes it is worth creating more specific modules.

Let's say you have an application that requires Lambda, an ECS service, CloudWatch alarms, RDS, EBS, etc. All of those elements are strongly connected.

You have three options:

  1. Create a separate module for each of the items above, but then your application uses five modules.
  2. Create one big module, so you can deploy your app with a single module.
  3. Mix the two approaches above, which is what I prefer.

Everything depends on details and some circumstances, but I will show you how I use Terraform in production at different companies.

Separate definitions for separate resources

This is a project where each environment is a directory. Every application, networking component, and data resource has its own state. I keep mutable data (RDS, EBS, EFS, S3, etc.) in separate directories, so all apps, networking, etc. can be destroyed and recreated because they are stateless. Stateful resources must never be destroyed, because data would be lost. This is what I have been doing for the last few years.

project/
├─ packer/
├─ ansible/
├─ terraform/
│  ├─ environments/
│  │  ├─ production/
│  │  │  ├─ apps/
│  │  │  │  ├─ blog/
│  │  │  │  ├─ ecommerce/
│  │  │  ├─ data/
│  │  │  │  ├─ efs-ecommerce/
│  │  │  │  ├─ rds-ecommerce/
│  │  │  │  ├─ s3-blog/
│  │  │  ├─ general/
│  │  │  │  ├─ main.tf
│  │  │  ├─ network/
│  │  │  │  ├─ main.tf
│  │  │  │  ├─ terraform.tfvars
│  │  │  │  ├─ variables.tf
│  │  ├─ staging/
│  │  │  ├─ apps/
│  │  │  │  ├─ ecommerce/
│  │  │  │  ├─ blog/
│  │  │  ├─ data/
│  │  │  │  ├─ efs-ecommerce/
│  │  │  │  ├─ rds-ecommerce/
│  │  │  │  ├─ s3-blog/
│  │  │  ├─ network/
│  │  ├─ test/
│  │  │  ├─ apps/
│  │  │  │  ├─ blog/
│  │  │  ├─ data/
│  │  │  │  ├─ s3-blog/
│  │  │  ├─ network/
│  ├─ modules/
│  │  ├─ apps/
│  │  │  ├─ blog/
│  │  │  ├─ ecommerce/
│  │  ├─ common/
│  │  │  ├─ acm/
│  │  │  ├─ user/
│  │  ├─ computing/
│  │  │  ├─ server/
│  │  ├─ data/
│  │  │  ├─ efs/
│  │  │  ├─ rds/
│  │  │  ├─ s3/
│  │  ├─ networking/
│  │  │  ├─ alb/
│  │  │  ├─ front-proxy/
│  │  │  ├─ vpc/
│  │  │  ├─ vpc-pairing/
├─ tools/

To apply a single application, you need to do:

cd ./project/terraform/environments/<ENVIRONMENT>/apps/blog;
terraform apply;
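
For illustration only, the root module inside one of those leaf directories might look roughly like this. The answer does not show file contents, so the bucket name, state key, region, and module inputs are all assumptions:

```hcl
# environments/production/apps/blog/main.tf (hypothetical contents)

terraform {
  backend "s3" {
    bucket = "example-terraform-state"      # assumed shared state bucket
    key    = "production/apps/blog.tfstate" # one state file per leaf directory
    region = "eu-west-1"                    # assumed region
  }
}

provider "aws" {
  region = "eu-west-1" # assumed region
}

module "blog" {
  # Four levels up from apps/blog is the terraform/ directory in the tree above.
  source = "../../../../modules/apps/blog"

  # Hypothetical inputs; the real module interface is not shown in the answer.
  environment    = "production"
  instance_count = 2
}
```

Because each leaf directory declares its own backend key, `terraform init` and `terraform apply` only touch that one state, and Terraform does not traverse sibling directories.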

You can see there are a lot of directories in all environments, and I see some pros and cons:

Cons:

  • It is hard to check if all modules are in sync
  • Complicated CI
  • Complicated directory structure, especially for new people in the team, but it is logical
  • There may be a lot of dependencies, but this is not a problem when you think about it from the beginning
  • You need to take care to keep the environments exactly the same.
  • There is a lot of initialization required, and refactors are hard to do

Pros:

  • Quick apply after small changes
  • Separated applications and resources. It is easy to modify small modules or perform small deployments without knowledge about the overall system
  • It is easier to clean up when you remove something
  • It's easy to tell which module needs to be fixed. I wrote some tools to analyze the status of particular parts of the infrastructure, and I can send an email to particular developers telling them that their infrastructure needs resyncing for some reason.
  • You can maintain different environments easier than in a monolithic model. You can destroy single applications if you do not need them in an environment.
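
One possible way to wire the separate states together, sketched under assumptions: the data stack guards its bucket with `prevent_destroy` and exposes an output, and the app stack reads that output through the `terraform_remote_state` data source. The answer does not say which mechanism it uses, and the bucket names, state keys, and region here are placeholders:

```hcl
# --- environments/production/data/s3-blog/main.tf (hypothetical) ---
resource "aws_s3_bucket" "blog" {
  bucket = "example-blog-production" # placeholder name

  lifecycle {
    # Terraform refuses any plan that would destroy this resource.
    prevent_destroy = true
  }
}

output "bucket_name" {
  value = aws_s3_bucket.blog.bucket
}

# --- environments/production/apps/blog/data.tf (hypothetical) ---
data "terraform_remote_state" "blog_bucket" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"         # assumed state bucket
    key    = "production/data/s3-blog.tfstate" # assumed key of the data stack
    region = "eu-west-1"                       # assumed region
  }
}

locals {
  # The stateless app stack only reads the bucket name; it never owns the bucket.
  blog_bucket_name = data.terraform_remote_state.blog_bucket.outputs.bucket_name
}
```

The two halves belong to different directories and therefore to different states; they are shown together only to make the dependency visible.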

Monolith infrastructure

Recently I started working with a new company. They keep their infrastructure definition in a few huge repositories (or folders), and when you run terraform apply, you create all applications at the same time.

project/
├─ modules/
│  ├─ acm/
│  ├─ app-blog/
│  ├─ app-ecommerce/
│  ├─ server/
│  ├─ vpc/
├─ vars/
│  ├─ user/
│  ├─ prod.tfvars
│  ├─ staging.tfvars
│  ├─ test.tfvars
├─ applications.tf
├─ providers.tf
├─ proxy.tf
├─ s3.tf
├─ users.tf
├─ variables.tf
├─ vpc.tf

Here, you prepare different input values for each environment. So if you want to apply changes to production (for example):

terraform apply -var-file=vars/prod.tfvars -lock-timeout=300s

Apply staging:

terraform apply -var-file=vars/staging.tfvars -lock-timeout=300s
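
A minimal sketch of how such per-environment inputs could drive a single configuration. The variable names and the toggle are assumptions, and `count` on a module block needs Terraform 0.13 or later:

```hcl
# variables.tf (hypothetical)
variable "environment" {
  type = string
}

variable "enable_ecommerce" {
  type    = bool
  default = true
}

# applications.tf (hypothetical)
module "app_blog" {
  source      = "./modules/app-blog"
  environment = var.environment # assumed module input
}

module "app_ecommerce" {
  source = "./modules/app-ecommerce"

  # Environments that do not need the shop set enable_ecommerce = false.
  count = var.enable_ecommerce ? 1 : 0

  environment = var.environment # assumed module input
}

# vars/test.tfvars (hypothetical) would then contain:
#   environment      = "test"
#   enable_ecommerce = false
```

Every difference between environments has to be expressed this way, as a variable plus a conditional, because all environments share the same files.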

Cons:

  • You have no dependencies - but sometimes you need to prepare some environmental elements like domains, elastic IPs, etc. manually, or you need to have them created before terraform plan/apply. Then you have a problem.
  • It's hard to do cleanup, as you have hundreds of resources and modules at the same time
  • Extremely long Terraform execution. Here it takes around 45 minutes to plan/apply a single environment.
  • It's hard to understand the entire environment.
  • Usually you need to have two or three repositories if you keep that structure to separate networking, apps, DNS, etc.
  • You need to do much more work to deal with different environments. You need to use count, etc.

Pros:

  • It's easy to check if your infrastructure is up to date
  • There is no complicated directory structure.
  • All your environments are exactly the same.
  • Refactoring may be easier, because you have all resources in very few places.
  • A small number of initializations are required.

Summary

As you can see, this is more of an architectural problem. The only way to learn it is to gain more experience or read posts from other people. I am still trying to figure out the optimal way, and I would probably experiment with the first approach. Do not take the advantages I listed as a sure thing. This post is just my experience - maybe not the best.

Upvotes: 62

Jatin Mehrotra

Reputation: 11523

There is an official guide to module structure in the official Terraform style guide:

https://developer.hashicorp.com/terraform/language/style#module-structure

Upvotes: 0

bobbyD

Reputation: 21

Old article but thought I'd add my view as it's such a common question and there is no right or wrong approach (except to say that one massive deployment for ALL resources, that takes 20 minutes to figure out a Plan is asking for trouble as the blast radius would be huge). There's no hard rule for size of deployment, but I try to go with a rule of thumb of around 20-30 resources (max) and of course common sense. If it takes 10 minutes for TF to figure out the plan for adding a tag, then your deployment is probably too big.

After using Terraform for 4 or 5 years, I've tried all sorts: PowerShell wrappers, workspaces, terragrunt, pipelines, and Terraform Cloud. When using Open Source, I tend to go with an approach similar to @deltakroneker's, using a different backend.tf file per environment as well as a .tfvars file. Run this from a pipeline to add approval gates etc. and it works reasonably well. It's not perfect, but then what approach is?

It's similar to a workspace approach, except it allows you to specify different storage accounts for each env (when using Azure blob backend).

environments/
  dev/
    backend.tf
    environment.tfvars
  stage/
    backend.tf
    environment.tfvars 
tf-deploy/
  provider.tf
  main.tf
  variables.tf

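To plan or apply against an environment, first select its backend from tf-deploy/ with terraform init -backend-config=../environments/dev/backend.tf (the -backend-config option belongs to terraform init, not to plan or apply; add -reconfigure when switching between environments), then run terraform plan -var-file=../environments/dev/environment.tfvars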
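
As a sketch of how the pieces could fit together with the Azure blob backend mentioned above: the backend block in tf-deploy/ stays empty, and each environment's backend.tf supplies only the backend arguments. All values are placeholders, and the file contents are an assumption since the answer does not show them:

```hcl
# tf-deploy/provider.tf: a partial backend configuration, completed at init time.
terraform {
  backend "azurerm" {}
}

# environments/dev/backend.tf (assumed contents): plain key/value pairs,
# not a full terraform { backend ... } block.
#
#   resource_group_name  = "rg-tfstate-dev"
#   storage_account_name = "sttfstatedev"
#   container_name       = "tfstate"
#   key                  = "dev.tfstate"
#
# Commands, run from tf-deploy/:
#
#   terraform init -reconfigure -backend-config=../environments/dev/backend.tf
#   terraform plan -var-file=../environments/dev/environment.tfvars
```

Because environments/dev/ is not the working directory, its backend.tf is never loaded as configuration; it is only read as an argument file during init.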

Authentication to the backend is via environment variables (not in the backend.tf file). If done via a Pipeline then all sensitive vars can be gathered from a vault of some kind as part of the pipeline initialisation.

It's not perfect. You still have the question of how to try new module or provider versions without promoting them to higher environments (with this approach, what you get in Dev, you ultimately get in Prod). In this case, approval gates and management of these types of changes become key. Alternatively, incorporating some kind of branched deployment for these types of changes could be an option.

Upvotes: 2

deltakroneker

Reputation: 771

I realized as @MarkB suggested, that terraform workspaces are actually a solution to multi-env projects.

So my project structure looks something like this:

infra/
  dev/
    dev.tfvars
  stage/
    stage.tfvars 
  provider.tf
  main.tf
  variables.tf

main.tf references modules, provider.tf sets up the provider, backend.tf would set up the remote backend (yet to add), etc.

The 'terraform plan' in this configuration becomes 'terraform plan -var-file dev/dev.tfvars' where I specify the file with a specific configuration for that environment.
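
A hedged sketch of what the not-yet-added backend and a workspace-aware resource could look like. The bucket names, key, and region are placeholders rather than values from this answer:

```hcl
# backend.tf (hypothetical)
terraform {
  backend "s3" {
    bucket = "example-terraform-state" # placeholder
    key    = "infra.tfstate"
    region = "eu-west-1"               # placeholder
  }
}

# main.tf (hypothetical excerpt)
locals {
  # Name of the currently selected workspace, for example "dev" or "stage".
  environment = terraform.workspace
}

resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-${local.environment}" # placeholder naming scheme
}

# Typical command sequence:
#
#   terraform workspace new dev      (first time only)
#   terraform workspace select dev
#   terraform plan -var-file=dev/dev.tfvars
```

With the S3 backend, each non-default workspace gets its own state object under the `env:/<workspace>/` prefix of the same bucket, so dev and stage states stay separate without a second backend block.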

Upvotes: 17

yi1

Reputation: 413

I can share what we ended up doing for our Indeni Cloudrail service. Hope it'll help.

We created a folder with all the modules. Then, there's a module called "all" which basically calls the other modules (s3, acm, etc.) with the right parameters. The "all" module has variables.

Then, there are environments. Each of them calls the "all" module with specific values for these variables.
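
Roughly, that layering could look like the following. The variable names and values are invented for illustration, since the real interface of the "all" module is not shown:

```hcl
# environments/staging/main.tf (hypothetical)
module "all" {
  source = "../../modules/all"

  # Assumed inputs; each environment passes its own values.
  environment        = "staging"
  vpc_cidr           = "10.20.0.0/16"
  rds_instance_class = "db.t3.medium"
}

# modules/all/s3.tf (hypothetical excerpt): the "all" module forwards its
# variables to the smaller building-block modules.
module "s3" {
  source      = "../s3"
  environment = var.environment
}
```

Each environment directory then stays small, and the differences between environments are visible in one place as the arguments of the single `module "all"` call.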

This is the output of a "find" command on the root of the Terraform code (sorry it isn't prettier). I removed many of the files as they weren't needed to get the point across:

./common.tfvars
./terragrunt.hcl
./environments
./environments/prod
./environments/prod/main.tf
./environments/prod/terragrunt.hcl
./environments/prod/lambda.layer.zip
./environments/prod/terraform.tfvars
./environments/prod/lambda.zip
./environments/prod/common.tf
./environments/dev-john
./environments/dev-john/main.tf
./environments/dev-john/terragrunt.hcl
./environments/dev-john/terraform.tfvars
./environments/dev-john/common.tf
./environments/mgmt-dr
./environments/mgmt-dr/data.tf
./environments/mgmt-dr/main.tf
./environments/mgmt-dr/terragrunt.hcl
./environments/mgmt-dr/network.tf
./environments/mgmt-dr/terraform.tfvars
./environments/mgmt-dr/jenkins.tf
./environments/mgmt-dr/keypair.tf
./environments/mgmt-dr/common.tf
./environments/mgmt-dr/openvpn-as.tf
./environments/mgmt-dr/tgw.tf
./environments/mgmt-dr/vars.tf
./environments/staging
./environments/staging/main.tf
./environments/staging/terragrunt.hcl
./environments/staging/terraform.tfvars
./environments/staging/common.tf
./environments/mgmt
./environments/mgmt/data.tf
./environments/mgmt/main.tf
./environments/mgmt/terragrunt.hcl
./environments/mgmt/network.tf
./environments/mgmt/terraform.tfvars
./environments/mgmt/route53.tf
./environments/mgmt/acm.tf
./environments/mgmt/jenkins.tf
./environments/mgmt/keypair.tf
./environments/mgmt/common.tf
./environments/mgmt/openvpn-as.tf
./environments/mgmt/tgw.tf
./environments/mgmt/alb.tf
./environments/mgmt/vars.tf
./environments/develop
./environments/develop/main.tf
./environments/develop/terragrunt.hcl
./environments/develop/terraform.tfvars
./environments/develop/common.tf
./environments/preproduction
./environments/preproduction/main.tf
./environments/preproduction/terragrunt.hcl
./environments/preproduction/terraform.tfvars
./environments/preproduction/common.tf
./environments/prod-dr
./environments/prod-dr/main.tf
./environments/prod-dr/terragrunt.hcl
./environments/prod-dr/terraform.tfvars
./environments/prod-dr/common.tf
./environments/preproduction-dr
./environments/preproduction-dr/main.tf
./environments/preproduction-dr/terragrunt.hcl
./environments/preproduction-dr/terraform.tfvars
./environments/preproduction-dr/common.tf
./README.rst
./modules
./modules/secrets-manager
./modules/secrets-manager/main.tf
./modules/s3
./modules/s3/main.tf
./modules/cognito
./modules/cognito/main.tf
./modules/cloudfront
./modules/cloudfront/main.tf
./modules/cloudfront/files
./modules/cloudfront/files/lambda.zip
./modules/cloudfront/main.py
./modules/all
./modules/all/ecs.tf
./modules/all/data.tf
./modules/all/db-migration.tf
./modules/all/s3.tf
./modules/all/kms.tf
./modules/all/rds-iam-auth.tf
./modules/all/network.tf
./modules/all/acm.tf
./modules/all/cloudfront.tf
./modules/all/templates
./modules/all/lambda.tf
./modules/all/tgw.tf
./modules/all/guardduty.tf
./modules/all/cognito.tf
./modules/all/step-functions.tf
./modules/all/secrets-manager.tf
./modules/all/api-gateway.tf
./modules/all/rds.tf
./modules/all/cloudtrail.tf
./modules/all/vars.tf
./modules/ecs
./modules/ecs/cluster
./modules/ecs/cluster/main.tf
./modules/ecs/task
./modules/ecs/task/main.tf
./modules/step-functions
./modules/step-functions/main.tf
./modules/api-gw
./modules/api-gw/resource
./modules/api-gw/resource/main.tf
./modules/api-gw/method
./modules/api-gw/method/main.tf
./modules/api-gw/rest-api
./modules/api-gw/rest-api/main.tf
./modules/cloudtrail
./modules/cloudtrail/main.tf
./modules/cloudtrail/README.rst
./modules/transit-gateway
./modules/transit-gateway/attachment
./modules/transit-gateway/attachment/main.tf
./modules/transit-gateway/README.rst
./modules/transit-gateway/gateway
./modules/transit-gateway/gateway/main.tf
./modules/openvpn-as
./modules/openvpn-as/main.tf
./modules/load-balancer
./modules/load-balancer/outputs.tf
./modules/load-balancer/main.tf
./modules/load-balancer/vars.tf
./modules/lambda
./modules/lambda/main.tf
./modules/vpc
./modules/vpc/3tier
./modules/vpc/3tier/main.tf
./modules/vpc/3tier/README.rst
./modules/vpc/peering
./modules/vpc/peering/main.tf
./modules/vpc/peering/README.rst
./modules/vpc/public
./modules/vpc/public/main.tf
./modules/vpc/public/README.rst
./modules/vpc/endpoint
./modules/vpc/endpoint/main.tf
./modules/vpc/README.rst
./modules/vpc/isolated
./modules/vpc/isolated/main.tf
./modules/vpc/isolated/README.rst
./modules/vpc/subnets
./modules/vpc/subnets/main.tf
./modules/vpc/subnets/README.rst
./modules/guardduty
./modules/guardduty/README.md
./modules/guardduty/region
./modules/guardduty/region/main.tf
./modules/guardduty/region/guardduty.tf
./modules/guardduty/region/sns-topic.tf
./modules/guardduty/region/vars.tf
./modules/guardduty/.gitignore
./modules/guardduty/base
./modules/guardduty/base/data.tf
./modules/guardduty/base/guardduty-sqs.tf
./modules/guardduty/base/guardduty-lambda.tf
./modules/guardduty/base/variables.tf
./modules/guardduty/base/guardduty-kms.tf
./modules/guardduty/base/bucket.tf
./modules/guardduty/base/guardduty-sns.tf
./modules/guardduty/base/src
./modules/guardduty/base/src/guardduty_findings_relay.py
./modules/guardduty/base/src/guardduty_findings_relay.zip
./modules/jenkins
./modules/jenkins/main.tf
./modules/rds
./modules/rds/main.tf
./modules/acm
./modules/acm/main.tf

Upvotes: 4
