In this series, I’ll demonstrate how to get started with infrastructure as code (IaC). My goal is to help developers build a strong understanding of this concept through tutorials and code examples. Here are the topics this series will cover:

Intro to IaC

IaC is an integral part of modern continuous integration pipelines. It is the process of managing and provisioning cloud and IT resources via machine-readable definition files. IaC enables organizations to create, manage, and destroy compute resources using modern DevOps tools by statically defining and declaring these resources in code.

In this post, I will discuss how to use HashiCorp’s Terraform to provision, deploy, and destroy infrastructure resources. Before we start, we’ll first need to create accounts in target cloud providers and services such as Google Cloud and Terraform Cloud. Once you have completed the Prerequisites below, we’ll start by learning how to use Terraform to create a new Google Kubernetes Engine (GKE) cluster.

Prerequisites

Before you get started, you’ll need to have these things:

This post works with the code in the Part 1 folder of this repo. Before we get to that, let’s briefly look at creating GCP credentials and then Terraform.

Creating GCP project credentials

You will need to create GCP credentials in order to perform administrative actions using IaC tooling.

  • Go to the create service account key page
  • Select the default service account or create a new one
  • Select JSON as the key type
  • Click Create
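  • Save this JSON file in the ~/.config/gcloud/ directory. You can rename the file to anything you like; just make sure the credentials variable in variables.tf (covered below) points to wherever you saved it.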

HashiCorp Terraform

HashiCorp Terraform is an open source tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing service providers as well as custom in-house solutions.

Terraform uses configuration files to describe the components needed to run a single application or your entire data center. It generates an execution plan describing what it will do to reach the desired state, and then executes it to build the described infrastructure. As the configuration changes, Terraform is able to determine what changed and create incremental execution plans which can then be applied to update infrastructure resources.

The infrastructure Terraform can manage includes low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.

Terraform providers

Let’s start with provisioning some resources in GCP using Terraform code. We want to be able to write some Terraform code that will define and create a new GKE cluster for us to use in part 2 of the series. Terraform is used to create, manage, and update infrastructure resources such as physical machines, VMs, network switches, containers, and more. Almost any infrastructure type can be represented as a resource in Terraform.

A provider is responsible for understanding API interactions and exposing resources. Providers are generally IaaS (e.g. Alibaba Cloud, AWS, GCP, Microsoft Azure, OpenStack), PaaS (e.g. Heroku), or SaaS (e.g. Terraform Cloud, DNSimple, Cloudflare) services.

To create a new GKE cluster, we need to rely on the GCP provider for our interactions with GCP. Once the provider is defined and configured, we’ll have the ability to build and control Terraform resources on GCP.

Terraform resources

Resources are the most important element in the Terraform language. Each resource block describes one or more infrastructure objects, such as virtual networks, compute instances, or higher-level components such as DNS records. A resource block declares a resource of a given type (e.g., google_container_cluster) with a given local name (e.g., “web”). The name is used to refer to this resource from elsewhere in the same Terraform module, but has no significance outside of the scope of a module.
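
As a minimal sketch of that naming scheme, a resource block and a reference to it from elsewhere in the same module might look like the following. The google_compute_network resource and the names web and web-network are illustrative assumptions, not part of this series’ repo.

```hcl
# Resource type: google_compute_network; local name: web (hypothetical).
resource "google_compute_network" "web" {
  name                    = "web-network" # assumed name, for illustration only
  auto_create_subnetworks = false
}

# Elsewhere in the same module, the resource is addressed as <TYPE>.<NAME>.
output "web_network_name" {
  value = google_compute_network.web.name
}
```

The local name web only exists inside this module; the name GCP sees is the value of the name argument.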

Terraform code

Now that you have a better understanding of Terraform providers and resources, let’s dig into some code. Terraform code is maintained within directories. Since we are using the CLI tool, you must execute commands from within the root directories where the code is located. For part 1 of the series, the Terraform code we’ll be using is located in the part01/iac_gke_cluster folder here. Within this directory you will see these files:

  • providers.tf
  • variables.tf
  • main.tf
  • output.tf

These files represent the GCP infrastructure resources that we’re going to create. These are what Terraform processes. You can place all of the Terraform code into one file, but that tends to get harder to manage as the code grows. Most Terraform devs create a separate file for every element. Let’s do a quick breakdown of each file and discuss the critical elements of each.

Breakdown: providers.tf

The providers.tf file is where we define the cloud provider we’ll be using. We’ll be using the google provider, which exposes resources such as google_container_cluster. The contents of the providers.tf file are shown below.

provider "google" {
  # version     = "2.7.0"
  credentials = file(var.credentials)
  project     = var.project
  region      = var.region
}

The above code block sets its parameters inside curly braces { }. The credentials parameter specifies the file path to the GCP credentials JSON file that you created earlier. Notice that the values for the parameters are prefixed with var. The var prefix defines the usage of Terraform Input Variables, which serve as parameters for a Terraform module. This allows aspects of the module to be customized without altering the module’s own source code, and allows modules to be shared between different configurations. When you declare variables in the root module of your configuration, you can set their values using CLI options and environment variables. When you declare them in child modules, the calling module will pass values in the module block.
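
Besides CLI options and environment variables, Terraform also automatically loads a file named terraform.tfvars from the working directory if one exists. The sketch below assumes you create such a file next to providers.tf; the values shown are placeholders, not values from the repo.

```hcl
# terraform.tfvars (hypothetical file; every value here is a placeholder)
project     = "my-gcp-project"
region      = "us-west1"
credentials = "~/.config/gcloud/my-service-account.json"
```

Each name on the left must match a variable declared in the configuration. Any variable not set here falls back to its default, if it has one.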

Breakdown: variables.tf

The variables.tf file specifies all the input variables that this Terraform project uses.

variable "project" {
  default = "cicd-workshops"
}

variable "region" {
  default = "us-east1"
}

variable "zone" {
  default = "us-east1-d"
}

variable "cluster" {
  default = "cicd-workshops"
}

variable "credentials" {
  default = "~/.ssh/cicd_demo_gcp_creds.json"
}

variable "kubernetes_min_ver" {
  default = "latest"
}

variable "kubernetes_max_ver" {
  default = "latest"
}

The variables defined in the above file will be used throughout this project. All of these variables have default values, but these values can be changed by defining them with the CLI when executing Terraform code. These variables add much-needed flexibility to the code and enable valuable code reusability.
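
The main.tf file discussed in the next section also references var.machine_type and var.app_name, which do not appear in the listing above. If your copy of variables.tf lacks them, Terraform will report a reference to an undeclared input variable. A minimal sketch of the two declarations follows; the default values are assumptions chosen for illustration, so check the repo for the author’s actual values.

```hcl
variable "machine_type" {
  default = "n1-standard-1" # assumed default; any valid GCP machine type works
}

variable "app_name" {
  default = "demo-app" # assumed placeholder, used for node labels and tags
}
```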

Breakdown: main.tf

The main.tf file defines the bulk of our GKE cluster parameters.

terraform {
  required_version = "~>0.12"
  backend "remote" {
    organization = "datapunks"
    workspaces {
      name = "iac_gke_cluster"
    }
  }
}

resource "google_container_cluster" "primary" {
  name               = var.cluster
  location           = var.zone
  initial_node_count = 3

  master_auth {
    username = ""
    password = ""

    client_certificate_config {
      issue_client_certificate = false
    }
  }

  node_config {
    machine_type = var.machine_type
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    metadata = {
      disable-legacy-endpoints = "true"
    }

    labels = {
      app = var.app_name
    }

    tags = ["app", var.app_name]
  }

  timeouts {
    create = "30m"
    update = "40m"
  }
}

Let’s break down each element of the main.tf file, starting with the terraform block. This block specifies the required Terraform version and the type of Terraform backend. A “backend” in Terraform determines how state is loaded and how an operation such as apply is executed. This abstraction enables non-local file state storage, remote execution, etc. In this code block, we’re using the remote backend, which uses Terraform Cloud and is connected to the iac_gke_cluster workspace you created in the prerequisites section.

terraform {
  required_version = "~>0.12"
  backend "remote" {
    organization = "datapunks"
    workspaces {
      name = "iac_gke_cluster"
    }
  }
}
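
If you do not have access to a Terraform Cloud organization, one alternative is the local backend, which keeps state in a file on your machine. This is a sketch of a swapped-in configuration, not what the repo uses; the path shown is Terraform’s default state file name.

```hcl
terraform {
  required_version = "~>0.12"

  # Assumption: state is stored locally instead of in Terraform Cloud.
  backend "local" {
    path = "terraform.tfstate"
  }
}
```

With this variant, operations such as apply run on your own machine, and the state file should be kept out of version control because it can contain sensitive values.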

The next code block defines the GKE cluster that we’re going to create. We’re also using some of the variables defined in variables.tf. The resource block has many parameters which are used to provision and configure the GKE cluster on GCP. The important parameters here are name, location, and initial_node_count, which specifies the initial number of compute nodes (virtual machines) that will make up this new cluster. We’re starting with three compute nodes for this cluster.

resource "google_container_cluster" "primary" {
  name               = var.cluster
  location           = var.zone
  initial_node_count = 3

  master_auth {
    username = ""
    password = ""

    client_certificate_config {
      issue_client_certificate = false
    }
  }

  node_config {
    machine_type = var.machine_type
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    metadata = {
      disable-legacy-endpoints = "true"
    }

    labels = {
      app = var.app_name
    }

    tags = ["app", var.app_name]
  }

  timeouts {
    create = "30m"
    update = "40m"
  }
}

Breakdown: output.tf

Terraform uses a concept called output values. These are the return values of a Terraform module. A child module can use outputs to expose a subset of its resource attributes to a parent module, and a root module can use them to print certain values in the CLI output after running terraform apply. The blocks in output.tf below expose values such as the cluster name and the cluster endpoint, as well as sensitive data, which is marked with the sensitive parameter.


output "cluster" {
  value = google_container_cluster.primary.name
}

output "host" {
  value     = google_container_cluster.primary.endpoint
  sensitive = true
}

output "cluster_ca_certificate" {
  value     = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
  sensitive = true
}

output "username" {
  value     = google_container_cluster.primary.master_auth.0.username
  sensitive = true
}

output "password" {
  value     = google_container_cluster.primary.master_auth.0.password
  sensitive = true
}
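
To illustrate the child-to-parent case described above, here is a sketch in which a parent configuration calls a child module and reads one of its outputs. The module name gke and the source path ./modules/gke are hypothetical. The sketch assumes the child module declares project and region variables and a cluster output like the one in output.tf, and that the parent declares project and region as in variables.tf.

```hcl
# Parent module: pass input variables to the child in the module block.
module "gke" {
  source  = "./modules/gke" # hypothetical path to the child module
  project = var.project
  region  = var.region
}

# Child outputs are read as module.<NAME>.<OUTPUT>.
output "cluster_name" {
  value = module.gke.cluster
}
```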

Initializing Terraform

Now that we’ve covered our Terraform project and syntax, it’s time to start provisioning our GKE cluster using Terraform. Change directory into the part01/iac_gke_cluster folder:

cd part01/iac_gke_cluster

While in part01/iac_gke_cluster, run this command:

terraform init

This was my output.

root@d9ce721293e2:~/project/terraform/gcp/compute# terraform init

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "google" (hashicorp/google) 3.10.0...

* provider.google: version = "~> 3.10"

Terraform has been successfully initialized!

Previewing with Terraform

Terraform has a command that allows you to dry run and validate your Terraform code without actually executing anything. The command is called terraform plan. This command also lists all the actions and changes that Terraform will execute against your existing infrastructure. In the terminal, run:

terraform plan

This was my output.

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_container_cluster.primary will be created
  + resource "google_container_cluster" "primary" {
      + additional_zones            = (known after apply)
      + cluster_ipv4_cidr           = (known after apply)
      + default_max_pods_per_node   = (known after apply)
      + enable_binary_authorization = false
      + enable_intranode_visibility = (known after apply)
      + enable_kubernetes_alpha     = false
      + enable_legacy_abac          = false
      + enable_shielded_nodes       = false
      + enable_tpu                  = (known after apply)
      + endpoint                    = (known after apply)
      + id                          = (known after apply)
      + initial_node_count          = 3
      + instance_group_urls         = (known after apply)
      + label_fingerprint           = (known after apply)
      + location                    = "us-east1-d"
  }....
Plan: 1 to add, 0 to change, 0 to destroy.  

Terraform is going to create new GCP resources for you based on the code in the main.tf file.

Terraform apply

Now you’re ready to create the new infrastructure. Run this command in the terminal:

terraform apply

Terraform will prompt you to confirm your command. Type yes and press Enter.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Terraform will build your new GKE cluster on GCP.

Note: It will take 3-5 minutes for the cluster to complete. It’s not an instant process because the back-end systems are provisioning and bringing things online.

After my cluster was completed, this was my output.

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

cluster = cicd-workshops
cluster_ca_certificate = <sensitive>
host = <sensitive>
password = <sensitive>
username = <sensitive>

The new GKE cluster has been created and the output values are displayed. Notice that the output values marked sensitive are masked in the results with <sensitive> tags. This ensures sensitive data is protected, but available when needed.

Terraform destroy

Now that you have proof that your GKE cluster has been successfully created, run the terraform destroy command to destroy the assets that you created in this tutorial. You can leave it up and running, but be aware that there is a cost associated with any assets running on GCP and you will be liable for those costs. Google gives a generous $300 credit for its free trial sign-up, but you could easily eat through that if you leave assets running. It’s up to you, but running terraform destroy will terminate any running assets.

Run this command to destroy the GKE cluster:

terraform destroy

Summary

Congratulations! You’ve just completed part 1 of this series and leveled up your experience by provisioning and deploying a Kubernetes cluster to GCP using IaC and Terraform.

Continue to part 2 of the tutorial, where you’ll learn how to build a Docker image for an application, push that image to a repository, and then use Terraform to deploy that image as a container to GKE.

The following resources will help you expand your knowledge from here: