- Verification of prerequisites (Terraform CLI, Google Cloud account setup)
Install the Terraform CLI and the Google Cloud SDK.

The Terraform CLI provides the `terraform` binary.

- Windows: Download from the official installation page and follow the installation instructions.
- Windows (Chocolatey): Run `choco install terraform`.
- macOS: Use Homebrew with `brew install terraform`.
- Linux: Go to the official installation page and follow the instructions for your distribution.
The Google Cloud SDK provides the `gcloud` binary.

- Windows: Download from the official installation page and follow the installation instructions.
- Windows (Chocolatey): Run `choco install gcloudsdk`.
- macOS: Use Homebrew with `brew install --cask google-cloud-sdk`.
- Linux: Go to the official installation page and follow the instructions for your distribution.
- Go to Google Cloud Console and log in with your Google account.
- Grab the project ID of the project you will be using for the workshop. You can find it in the project dashboard or by clicking on the project name in the top navigation bar.
In a terminal, run the following command to log in:

```shell
gcloud auth login
```

This will open a web browser for you to log in with your Google account. After logging in, you will be prompted to select a project. Choose the project you will be using for the workshop.

If you're not prompted to select a project, set the project with:

```shell
gcloud config set project cloud-labs-workshop-42clws
```

Then run:

```shell
gcloud auth application-default login
```

To verify your project, run:

```shell
gcloud config list
```
This part will focus on Terraform state fundamentals and basic state manipulation.
Terraform uses state to map resource configurations to real-world infrastructure. The state file (`terraform.tfstate`) is a JSON file that contains the current state of your infrastructure. It is essential for Terraform to function correctly, as it allows Terraform to track resources and their dependencies. It will not contain all resources, only the resources Terraform knows about.
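As an illustration, an abridged state file might look like this (a hypothetical sketch — the exact fields vary with the Terraform and provider versions):

```json
{
  "version": 4,
  "terraform_version": "1.10.0",
  "serial": 7,
  "resources": [
    {
      "mode": "managed",
      "type": "google_compute_network",
      "name": "vpc",
      "provider": "provider[\"registry.terraform.io/hashicorp/google\"]",
      "instances": [
        {
          "attributes": {
            "name": "my-vpc",
            "auto_create_subnetworks": false
          }
        }
      ]
    }
  ]
}
```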
The state file can contain sensitive information, so it should be stored securely. Terraform 1.10 introduced ephemerality, with the special `ephemeral` block, which partially solves this issue by not storing secrets in state. You can read more about it in the docs.
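As a sketch of what ephemerality looks like (assuming Terraform >= 1.10; the variable name is illustrative):

```hcl
# An ephemeral input variable: its value is available during the run,
# but is never persisted to the plan or state file.
variable "db_password" {
  type      = string
  sensitive = true
  ephemeral = true
}
```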
Let's start with a simple example of a monolithic configuration. We've provided a starter configuration in `infra/main.tf` that creates a couple of networks, DNS records and service accounts.
- Create a new file `infra/terraform.auto.tfvars` with the following content:

  ```hcl
  name_prefix = "<your-unique-prefix>"
  project_id  = "cloud-labs-workshop-42clws"
  ```

  Your `name_prefix` should be unique to avoid collisions with other participants. If you run the workshop in your own project, replace the `project_id` accordingly.

- Run `terraform init` and `terraform apply` in the `infra/` folder to provision the resources. Contact a workshop facilitator if you have any trouble.
The previous commands will have created a `.terraform/` directory, a `.terraform.lock.hcl` file and a `terraform.tfstate` file. `.terraform.lock.hcl` is the only file that should be committed. `.terraform/` contains downloaded providers and modules.
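A typical `.gitignore` for a Terraform project therefore looks something like this (a suggestion, not part of the starter code):

```gitignore
# Downloaded providers and modules
.terraform/
# Local state files (may contain sensitive values)
terraform.tfstate
terraform.tfstate.backup
# Per-user variable values
*.auto.tfvars
```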
- Open `terraform.tfstate` in your favorite text editor. You will see a JSON file with the current state of your infrastructure. The `resources` key contains a list of all resources managed by Terraform, along with their attributes and metadata.

- `terraform.tfstate` contains many attributes that are not relevant. Run `terraform state list` to see a list of all resources managed by Terraform. You will see a list of resources with their addresses, such as `google_compute_network.vpc` and `google_dns_record_set.records[2]`.

- Run `terraform state show google_compute_network.vpc` to see the details of the `google_compute_network.vpc` resource. You will see a list of all attributes and their values, including the `id`, `name`, `auto_create_subnetworks`, and `project`.

- Run `terraform show` to display the attributes of all Terraform-managed resources.
Terraform has two mechanisms for changing the address of a resource: `terraform state mv` and the `moved` block. The first acts on the state file directly; the second is most commonly used when renaming resources in the configuration.
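The two mechanisms can be sketched as follows (the resource names here are illustrative, not from the workshop configuration):

```hcl
# Option 1: rename directly in state, without touching the configuration:
#   terraform state mv google_compute_network.old google_compute_network.new
#
# Option 2: declare the rename in configuration with a moved block,
# which is applied like any other change:
moved {
  from = google_compute_network.old
  to   = google_compute_network.new
}
```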
- Run `terraform state mv google_compute_network.vpc google_compute_network.vpc_renamed` to rename the `google_compute_network.vpc` resource to `google_compute_network.vpc_renamed`. This will update the state file, but not the configuration.

- Run `terraform plan` to examine how Terraform wishes to fix the drift between the state file and the configuration file. Note especially how this affects dependent resources, like the subnets and the DNS zone. Do not apply this configuration!

- Let's fix the state file with a `moved` block:

  ```hcl
  moved {
    from = google_compute_network.vpc_renamed
    to   = google_compute_network.vpc
  }
  ```

  Examine the output of `terraform plan` again. It should show the change to the state file, but also say "Plan: 0 to add, 0 to change, 0 to destroy" since this involves no actual changes.

- Apply the configuration (using `terraform apply`) before continuing. Then remove the `moved` block from the configuration.

- The `moved` blocks and `terraform state` commands also work on maps and lists of objects, such as `google_compute_subnetwork.subnets` (which has 3 resources). We'll do the same operation, but this time we'll start with the `moved` block.

  Modify the address of the resources from `subnets` to `subnets_renamed` (in `main.tf`), and optionally run `terraform plan` to see what happens. Then create a `moved` block that renames the addresses in the state file. Run `terraform plan` to verify that there are no changes other than the three address changes.

- Apply the configuration before undoing the changes in the configuration file. Running `terraform plan` should show 3 added and 3 destroyed resources. Use `terraform state mv` to fix the state file, and run `terraform plan` to verify that there are no changes.
You can read more about the available state manipulation commands.
"Drift" is when the defined state (or code) differs from the actual state of the infrastructure. This can happen for various reasons, such as manual changes made in the cloud provider's console, code changes not properly applied in an environment or changes made by other tools or scripts.
In order to handle drift, Terraform always executes a "refresh" before plan or apply operations to update the state file with real-world status. It will then reconcile the tracked resources in the state file with the actual status.
You can trigger a refresh manually with `terraform refresh` or using the `-refresh-only` flag with `terraform plan` and `terraform apply`. Refreshing the state separately is normally not necessary when running `terraform plan` or `terraform apply`, but can be useful in special situations.
- Next, go into the Google Cloud Console, and search for "VPC networks" in the top middle search bar. Click on the "VPC networks" link and find your VPC in the list. Click "Subnets" in the menu bar, and delete one of the subnets listed.

- Let's see the effect of refreshing the state.

  - Run `terraform plan -refresh=false`. You should not see any changes to be applied since the state file and configuration files are in sync.
  - Run `terraform plan -refresh-only -out=plan.tfplan`. You can see from the output which resources Terraform refreshes the status of. Terraform will generate a plan to update the state file.
  - Run `terraform apply plan.tfplan` to apply the changes to the state file.
  - (Optional) You can compare the state files by looking at the difference between `terraform.tfstate` and `terraform.tfstate.backup`.
  - Run `terraform plan -refresh=false` again. Terraform will now detect a difference between the state file and the configuration, and show the plan to change the real-world state back to the desired state described by the configuration.

- Run `terraform apply` and apply the configuration to get the infrastructure back to the desired state.
Note

`terraform apply -refresh-only` will give you the option to update the state file without generating an intermediate plan file (generally, all arguments given to `terraform plan` can also be given to `terraform apply`).

`terraform refresh` can be used to refresh and write the result to the state file directly, without reviewing it. This is most similar to what is actually done by `terraform plan` before generating the actual plan.
Our single configuration file is quite long and unmanageable. We can split it into multiple files to improve organization. As long as we keep the resources in files within the same directory, with the same resource name, Terraform will consider them the same. Moving files to different directories will create modules. For now, we will just create a logical split of files to simplify later refactoring.
Terraform has a style guide, which contains a section about file names. We will follow this style guide to reorganize our code.
- Look up the file names section in the style guide.

- Move all the non-resource blocks (variables, providers, etc.) into their respective files.

- You should now have files named `terraform.tf`, `providers.tf`, `variables.tf` and `main.tf`.

- The remaining resources in `main.tf` can be split into files based on their function (e.g., `dns.tf`, `network.tf`, etc.), if you want to.

- When you're done, run `terraform plan` to verify that there are no changes to the configuration.
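The resulting layout might look something like this (the optional per-function split is one possibility):

```
infra/
├── terraform.tf    # terraform {} settings (required_version, required_providers)
├── providers.tf    # provider "google" {} configuration
├── variables.tf    # input variable declarations
└── main.tf         # resources (optionally split into dns.tf, network.tf, …)
```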
By default, Terraform stores the state locally in a `terraform.tfstate` file. When using Terraform in a team, it is important for everyone to be working with the same state so that operations will be applied to the same remote objects.

With remote state, Terraform writes the state data to a remote data store, which can be shared between all members of a team. Remote state can be implemented by storing state in Amazon S3, Azure Blob Storage, Google Cloud Storage and more. Terraform configures the remote storage with a `backend` block.
- Run the following command to create a GCS bucket:

  ```shell
  gcloud storage buckets create gs://<bucket_name> --project=cloud-labs-workshop-42clws --location=europe-west1
  ```

  `<bucket_name>` can be any globally unique string; we recommend `<your_prefix>_state_storage_<random_string>`. The `<random_string>` should be 4-6 random lower case letters or numbers.

- Update the Terraform configuration to use the provisioned bucket as a backend:

  ```hcl
  terraform {
    backend "gcs" {
      bucket = "<bucket_name>"
      prefix = "terraform/state"
    }
  }
  ```

- Run `terraform init` and migrate the state to the bucket.

- Verify that the state is located in the storage bucket (`<prefix>` is defined as `terraform/state` above):

  ```shell
  gcloud storage ls gs://<bucket_name>/<prefix>
  ```

- If you want to view the contents of the state file, run `gcloud storage cat <path_to_state_file>`. The path to the state file is found in the output of the previous command.
- As long as the backend supports state locking, Terraform will lock your state for all operations that could write state. This prevents others from acquiring the lock and potentially corrupting your state. Since GCS supports state locking, this happens automatically. It is especially important when working in a team or when automated workflows (such as CI/CD pipelines) may run Terraform simultaneously, as it ensures only one operation can modify the state at a time.

- State locking can be verified as follows: Try changing the `google_dns_managed_zone.private_zone` resource name and run `terraform apply`, but leave it on the approval prompt. Then, in another terminal, run `terraform plan`. You should see that the state file is locked by the `terraform apply` operation.

- Also see the GCS remote backend docs for more info.
Terraform modules improve code reuse, organization and readability. Modules can be used to create reusable components in a repository, or to build a library of reusable components shared between teams.

We'll start with a `network` module that is responsible for creating both the VPC and the subnets. We'll also have to take care not to modify the existing resources, and will use the `moved` block to avoid actual changes to the infrastructure.

Modules are defined in their own directory, and are used by referencing the module's source. The module's source can be a local path, a Git repository or a Terraform registry. It's common to gather modules in a `modules/` folder at the repository root.
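For reference, the three common source types look like this (the registry address and Git URL below are illustrative, not part of the workshop):

```hcl
# Local path, relative to the calling configuration
module "from_local_path" {
  source = "../modules/network"
}

# Terraform registry, with a version constraint
module "from_registry" {
  source  = "terraform-google-modules/network/google"
  version = "~> 9.0"
}

# Git repository; '//' selects a subdirectory, '?ref=' pins a tag
module "from_git" {
  source = "git::https://github.com/example-org/terraform-modules.git//network?ref=v1.2.0"
}
```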
- Create the `modules/network` directory at the repository root.

- Create `modules/network/main.tf` and move the `google_compute_network` and `google_compute_subnetwork` resources into it.

- We'll need to pass variable definitions to the module. Modules follow the same naming conventions. Create `modules/network/variables.tf` and copy the required variables (`name_prefix`, `regions`, `subnet_cidrs`) from `variables.tf` into it. Remove the variable defaults, if any.

- Other resources need to reference the VPC id, so we'll need an output. Create `modules/network/outputs.tf` with the following content:

  ```hcl
  output "vpc_id" {
    description = "The ID of the VPC"
    value       = google_compute_network.vpc.id
  }
  ```

- Add a `module` block, replacing the previous network configuration file, to call the module:

  ```hcl
  module "network" {
    source       = "../modules/network"
    name_prefix  = var.name_prefix
    regions      = var.regions
    subnet_cidrs = var.subnet_cidrs
  }
  ```

  And update `google_dns_managed_zone.private_zone` to refer to the module output `vpc_id` in the `network_url` argument:

  ```hcl
  network_url = module.network.vpc_id
  ```

- Run `terraform init` and then `terraform plan` to verify that the changes are syntactically correct. Fix errors before continuing. Note that the plan will show changes to the infrastructure! But we can do this without changes to the infrastructure by using the `moved` block.

- When moving between modules, the `moved` block must be in the module you moved from (in this case the root module). Add these `moved` blocks in the same file as the module declaration:

  ```hcl
  moved {
    from = google_compute_network.vpc
    to   = module.network.google_compute_network.vpc
  }

  moved {
    # Note: We move all three subnets in the list at once
    from = google_compute_subnetwork.subnets
    to   = module.network.google_compute_subnetwork.subnets
  }
  ```

  Run `terraform plan` again and verify that there are no changes, except moving resources.

- Apply the moves with `terraform apply`. This will move the resources in the state file without changing the infrastructure. Run `terraform plan` and see that the moves are no longer planned actions.

- Delete the `moved` blocks from the configuration file.
Caution

Removing `moved` blocks in shared modules can cause breaking changes for consumers that haven't applied the move actions yet. This is not a problem here since we're the only consumer of the module. Read more in the docs.
The design of the `network` module can be improved; we'll get back to ways to do that in the extra tasks section later in the workshop.
For the DNS configuration, we'll only create a module for creating a single DNS A record, leaving the DNS zone in the root module.
- Following similar steps to creating the `network` module, create a `dns_a_record` module. This module should have three `string` variable inputs, `name`, `zone_name` and `ipv4_address`, and create a single `google_dns_record_set` resource. You will need to modify the resource to use the new variables.

- We can then use a loop with the `count` meta-argument in the root module when we call the module:

  ```hcl
  module "records" {
    count        = length(var.dns_records)
    source       = "../modules/dns_a_record"
    name         = "${var.dns_records[count.index]}.${var.name_prefix}.workshop.internal."
    zone_name    = google_dns_managed_zone.private_zone.name
    ipv4_address = "10.0.0.${10 + count.index}"
  }
  ```

- When refactoring resources that use looping this way, we need to use a `moved` block per resource:

  ```hcl
  moved {
    from = google_dns_record_set.records[0]
    to   = module.records[0].google_dns_record_set.record
  }

  moved {
    from = google_dns_record_set.records[1]
    to   = module.records[1].google_dns_record_set.record
  }

  moved {
    from = google_dns_record_set.records[2]
    to   = module.records[2].google_dns_record_set.record
  }
  ```

- Verify that the only actions are to move state, and apply the changes before removing the `moved` blocks.
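One possible shape of the module is sketched below (the `ttl` value is an assumption — copy it from the original resource in your configuration):

```hcl
# modules/dns_a_record/main.tf (sketch)
variable "name" {
  description = "Fully qualified record name, ending with a dot"
  type        = string
}

variable "zone_name" {
  description = "Name of the managed zone to create the record in"
  type        = string
}

variable "ipv4_address" {
  description = "IPv4 address for the A record"
  type        = string
}

resource "google_dns_record_set" "record" {
  name         = var.name
  managed_zone = var.zone_name
  type         = "A"
  ttl          = 300 # assumption: use the value from the original resource
  rrdatas      = [var.ipv4_address]
}
```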
The `service_account` module is a bit different, since it creates multiple resources using loops. The logic could be greatly simplified if we designed the module to create one service account and assign it a set of roles.

The `service_account` module should have variables `account_id`, `display_name`, `description`, `project_id` and `roles`. That is, we want a module that can replace the current looping logic with a module call similar to this:
```hcl
module "service_accounts" {
  count        = length(var.service_accounts)
  source       = "../modules/service_account"
  account_id   = "${var.name_prefix}-${var.service_accounts[count.index]}"
  display_name = "<removed for brevity>"
  description  = "<removed for brevity>"
  project_id   = var.project_id
  roles        = var.project_roles
}
```
- Create the `service_account` module in `modules/service_account` and use it. Note: This refactoring is not required to complete future tasks; feel free to skip it or come back to it later if you want to.
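A sketch of what `modules/service_account` could look like (assuming `roles` is a `list(string)`; variable declarations follow the same pattern as the other modules and are omitted here):

```hcl
# modules/service_account/main.tf (sketch)
resource "google_service_account" "this" {
  account_id   = var.account_id
  display_name = var.display_name
  description  = var.description
  project      = var.project_id
}

# One IAM membership per role, keyed by role name
resource "google_project_iam_member" "roles" {
  for_each = toset(var.roles)

  project = var.project_id
  role    = each.value
  member  = "serviceAccount:${google_service_account.this.email}"
}
```

Using `for_each = toset(var.roles)` keys each binding by the role string, so adding or removing a role only touches that one resource instance.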
- Comparing approaches:
- Workspace-based workflows
- Directory-based environments with multiple state files
- Terragrunt integration
- Hands-on lab: Implementing your chosen strategy with GCP projects
Terraform Cloud is a managed service provided by HashiCorp for running Terraform workflows in a collaborative and secure environment. It helps teams manage infrastructure as code at scale by handling Terraform execution, state storage, access control, version control integration and policy enforcement — all in the cloud. In this part we will explore how to store state in Terraform Cloud and run an automatic plan on pull requests.
Go to Terraform Cloud and create a free account. Once signed in, create an organization to store your infrastructure. HCP Terraform organizes your infrastructure resources by workspaces in an organization.
A workspace in HCP Terraform contains infrastructure resources, variables, state data, and run history. HCP Terraform offers a VCS-driven workflow that automatically triggers runs based on changes to your VCS repositories. The VCS-driven workflow enables collaboration within teams by establishing your shared repositories as the source of truth for infrastructure configuration. Complete the following steps to create a workspace:
- After selecting an organization, click New and choose Workspace from the dropdown menu.
- Choose a project to create the workspace in, and click Create.
- Configure the backend to let Terraform Cloud manage your state:

  ```hcl
  terraform {
    cloud {
      hostname     = "app.terraform.io"
      organization = "<your-organization-name>"

      workspaces {
        name = "<workspace-name>"
      }
    }
  }
  ```

Initialize the state with `terraform init`.
In order to trigger HCP Terraform runs from changes to VCS, you first need to create a new repository in your personal GitHub account.
In the GitHub UI, create a new repository. Name the repository learn-terraform, then leave the rest of the options blank and click Create repository.
Copy the remote endpoint URL for your new repository.
In the directory of your source code, update the remote endpoint URL for your repository to the one you just copied with `git remote set-url origin YOUR_REMOTE`. Add your changes, commit and push to your personal repository.
To connect your workspace with your new GitHub repository, follow the steps below:
- In your workspace, click on VCS workflow and choose an existing version control provider from the list or configure a new one. If you choose GitHub App, choose an organization and repository when prompted. You can choose your own private repositories by clicking on "Add another organization" and selecting your GitHub account.
- Under advanced options, set the Terraform Working Directory to `infra` and click Create.
Terraform Cloud also needs access to Google Cloud resources to be able to apply the necessary changes. We therefore need to add a workspace variable called GOOGLE_CREDENTIALS containing a service account key.

- Go to Google Cloud -> IAM & Admin -> Service Accounts and locate the Terraform Cloud service account (terraform-cloud-sa-clws@cloud-labs-workshop-42clws.iam.gserviceaccount.com).
- Under Actions, click on Manage keys and choose Create new key under the Add key dropdown.
- Head over to Terraform Cloud and, under your workspace variables, add a variable named GOOGLE_CREDENTIALS with the service account key as the value.

You should now be able to automate `terraform plan` on pull requests.
- Creating workflow files for Terraform
- Implementing secure credential handling for GCP
- Options:
- GitHub Actions setup
- Cloud Build integration
- Pull request automation (plan on PR, apply on merge)
- Hands-on exercises with:
  - `count` vs. `for_each`
  - `for` expressions
  - `flatten` and other collection functions
- Practical GCP examples (e.g., managing multiple GKE node pools)
- Performance considerations
- Use cases for dynamic blocks
- Lab: Implementing complex GCP resource configurations with dynamic blocks
- Discussion: Performance implications and maintainability trade-offs
- Best practices and anti-patterns
- Unit testing with Terratest
- Policy validation with OPA/Conftest
- Static analysis and linting
- Implementing pre-commit hooks
- GCP-specific compliance checks
Terraform has a type system covering basic functionality. It allows for type constraints in the configuration. Additionally, the language supports different kinds of custom conditions to validate assumptions and provide error messages.

The `network` module has two `list(string)` input variables, `regions` and `subnet_cidrs`, that are expected to be of the same length. Let's look at different ways of verifying this. First, let's get an introduction to variable validation.
- `subnet_cidrs` is of type `list(string)`, and we would like to validate that the ranges specified are valid. We can use a `validation` block inside our `variable` declaration. This would look like this:

  ```hcl
  variable "subnet_cidrs" {
    description = "CIDR ranges for subnets"
    type        = list(string)

    validation {
      condition     = alltrue([for cidr in var.subnet_cidrs : can(cidrhost(cidr, 255))])
      error_message = "At least one of the specified subnets was too small, or one of the CIDR ranges was invalid. The subnets need to contain at least 256 IP addresses (/24 or larger)."
    }
  }
  ```

  We added the `validation` block; the rest should be like before. Let's explain what's going on here:

  - The `condition` is a boolean expression that must be true for the validation to pass.
  - `alltrue` is a function that returns true if all elements in the list are true.
  - `for` is a for expression that iterates over the list of CIDR ranges and checks if each one is valid using the `cidrhost` function.
  - We can refer to the variable being validated with the same syntax as before: `var.subnet_cidrs`.
  - The `can` function is a special function that returns true if the expression can be evaluated without errors. I.e., if `cidrhost` returns an error due to an invalid or too small IP address range, `can` will return false.
  - If `condition` evaluates to false, the plan fails and the `error_message` will be printed.

  Add the validation, and try to provoke a validation error by specifying a smaller IP address range (e.g., a `/25`) or an invalid IP address. Make sure the code works again before moving to the next step.
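You can explore the behavior of `cidrhost` and `can` interactively in `terraform console`:

```
> cidrhost("10.0.0.0/24", 255)
"10.0.0.255"
> can(cidrhost("10.0.0.0/25", 255))
false
```

The second expression is false because a `/25` range only contains 128 addresses, so host number 255 is out of range and `cidrhost` returns an error.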
Note
For the following steps we will write the same code in different ways, so you might want to commit your code (or make a copy of it) in order to be able to revert later. If you've already done the tasks in Track D you might want to revert your changes to the network module.
- Depending on your use case, the best way might be to combine the variables into a structured type containing a list of objects with the properties `region` and `cidr` (the type definition would be `list(object({ region = string, cidr = string }))`). This would use the type system to ensure the assumptions are always correct. In Track D we do this refactoring, and will not repeat it here.
- Terraform 1.9.0 (released June 2024) introduced general expressions and cross-object references in input variable validations. This lets us refer to other variables, locals and more during validation.

  a. Add a new validation to either the `regions` or the `subnet_cidrs` variable to ensure that the two lists are of equal length. The condition should be `length(var.regions) == length(var.subnet_cidrs)`.
  b. Verify that the validation fails if the number of regions and CIDRs is not the same.
- For the purposes of this workshop, we can do a different refactoring to illustrate multiple validation blocks: let's combine the `regions` and `subnet_cidrs` variables into a `subnets` variable with type `object({ regions = list(string), cidrs = list(string) })`.

  a. Remove the validation from the previous step, and modify the validation from the first step to work with the new variable type. Also update the module and the calling code to work with the new variable type definition. Make sure `terraform plan` succeeds without modifications.
  b. Add a second validation block to the `subnets` variable that verifies that the two lists have the same length. Add an appropriate error message.
  c. Verify that the validation fails if the number of regions and CIDRs is not the same.
Checks let you define custom conditions that execute on every plan or apply operation. Failing checks produce warnings, but do not block Terraform's execution of operations.
- Checks are very flexible. We can write the input validations from the previous step as assertions. Transform the validation of subnet CIDRs into a check. The general syntax of a check is:

  ```hcl
  check "check_name" {
    // Optional data block

    // At least one assert block
    assert {
      condition     = 1 == 1 // condition boolean expression
      error_message = "error message"
    }
  }
  ```
- You can add data blocks inside checks to verify that a property of some resource outside the configuration or the scope of the module is correct. E.g., validate assumptions on VMs or clusters, check that resources are securely configured, or check that a website responds with 200 OK after Terraform has run, using the `http` provider.

  In the `dns_a_record` module, write a check that uses the `google_dns_managed_zone` data source and verifies that `visibility` is `"private"`.

  Apply the changes to the configuration.
- To see the check from the previous step in action, you can run `terraform destroy` to remove the resources and then apply the configuration again. Note how it gives you a warning during the `plan` step, since the managed zone does not exist yet. Terraform will still provision the DNS records, however, independent of the state of the checks.