Have you ever run an application through your CI/CD pipeline and seen all of the tests pass, only to have the application fail to function as expected when you deploy it to a live target environment? This situation is very common and plagues many teams - you can’t always anticipate what will happen when your application is pushed live. Smoke tests are designed to reveal these types of failures early by running test cases that cover the critical components and functionality of the application. Run against a deployed release, they also help confirm that the application functions as expected in its target environment. Smoke tests are often executed on every application build to verify that basic but critical functionality passes before moving on to more extensive and time-consuming testing. This creates fast feedback loops in the software development life cycle.

In this post, I’ll demonstrate how to add smoke tests to the deployment stage of a CI/CD pipeline so that simple aspects of the application are tested after deployment.

Technologies used

This post will reference the following technologies:

Prerequisites

This post relies on configurations and code that are featured in my previous post Automate releases from your pipelines using Infrastructure as Code. The full source code can be found in this repo.

Smoke tests

Smoke tests are great for exposing unexpected build errors and connection errors, and for validating a server’s response after a new release is deployed to a target environment. For example, a quick, simple smoke test could validate that an application is accessible and responds with an expected status code such as 200 OK, 300, 301, or 404. The examples in this post test that the deployed app responds with a 200 OK status code and that the default page content renders the expected text.
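A status code check like the one described above can be sketched in a few lines of Bash. This is a minimal illustration rather than the code used later in this post: the function names fetch_status and status_is_expected are hypothetical, and the address in the usage comment is a placeholder.

```shell
#!/bin/bash

# Print only the HTTP status code for a URL. curl reports 000 when no
# response is received (for example, a refused connection or a timeout).
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeed when the first argument matches any of the accepted status codes.
status_is_expected() {
  local actual="$1"
  shift
  local accepted
  for accepted in "$@"; do
    if [ "$actual" = "$accepted" ]; then
      return 0
    fi
  done
  return 1
}

# Example usage (the address is a placeholder, not a real deployment):
#   status_is_expected "$(fetch_status "http://203.0.113.10")" 200 || exit 1
```

Keeping the comparison separate from the request makes it easy to accept more than one code, for example 200 and 301, without changing the curl call.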

CI/CD pipelines without smoke tests

Let’s take a look at an example pipeline config that is designed to run unit tests, build a Docker image, and push it to Docker Hub. It also uses infrastructure as code (Pulumi) to provision a new Google Kubernetes Engine (GKE) cluster and deploy this release to the cluster. This pipeline config example does not implement smoke tests. Be aware that if you run this specific pipeline example, a new GKE cluster will be created and will live on until you manually run the pulumi destroy command, which terminates all of the infrastructure it created.

Caution: Not terminating the infrastructure will result in unexpected costs.

version: 2.1
orbs:
  pulumi: pulumi/pulumi@1.0.1
jobs:
  build_test:
    docker:
      - image: circleci/python:3.7.2
        environment:
          PIPENV_VENV_IN_PROJECT: 'true'
    steps:
      - checkout
      - run:
          name: Install Python Dependencies
          command: |
            pipenv install --skip-lock
      - run:
          name: Run Tests
          command: |
            pipenv run pytest
  build_push_image:
    docker:
      - image: circleci/python:3.7.2
    steps:
      - checkout
      - setup_remote_docker:
          docker_layer_caching: false
      - run:
          name: Build and push Docker image
          command: |
            pipenv install --skip-lock
            pipenv run pip install --upgrade 'setuptools<45.0.0'
            pipenv run pyinstaller -F hello_world.py
            echo 'export TAG=${CIRCLE_SHA1}' >> $BASH_ENV
            echo 'export IMAGE_NAME=orb-pulumi-gcp' >> $BASH_ENV
            source $BASH_ENV
            docker build -t $DOCKER_LOGIN/$IMAGE_NAME -t $DOCKER_LOGIN/$IMAGE_NAME:$TAG .
            echo $DOCKER_PWD | docker login -u $DOCKER_LOGIN --password-stdin
            docker push $DOCKER_LOGIN/$IMAGE_NAME
  deploy_to_gcp: 
    docker:
      - image: circleci/python:3.7.2
        environment:
          CLOUDSDK_PYTHON: '/usr/bin/python2.7'
          GOOGLE_SDK_PATH: '~/google-cloud-sdk/'
    steps:
      - checkout
      - pulumi/login:
          access-token: ${PULUMI_ACCESS_TOKEN}
      - run:
          name: Install dependencies
          command: |
            cd ~/
            sudo pip install --upgrade pip==18.0 && pip install --user -r project/reqs.txt
            curl -o gcp-cli.tar.gz https://dl.google.com/dl/cloudsdk/channels/rapid/google-cloud-sdk.tar.gz
            tar -xzvf gcp-cli.tar.gz
            echo ${GOOGLE_CLOUD_KEYS} | base64 --decode --ignore-garbage > ${HOME}/project/pulumi/gcp/gke/cicd_demo_gcp_creds.json
            ./google-cloud-sdk/install.sh  --quiet
            echo 'export PATH=$PATH:~/google-cloud-sdk/bin' >> $BASH_ENV
            source $BASH_ENV
            gcloud auth activate-service-account --key-file ${HOME}/project/pulumi/gcp/gke/cicd_demo_gcp_creds.json
      - pulumi/update:
          stack: k8s
          working_directory: ${HOME}/project/pulumi/gcp/gke/
workflows:
  build_test_deploy:
    jobs:
      - build_test
      - build_push_image:
          requires:
            - build_test
      - deploy_to_gcp:
          requires:
            - build_push_image

This pipeline deploys the new app release to a new GKE cluster, but we do not know if the application is actually up and running after all of this automation completes. How do we quickly validate that the application has been deployed and is functioning properly in this new GKE cluster? Adding smoke tests to your CI/CD pipeline is a great way to quickly and easily validate the application’s status after deployment.

How do I write a smoke test?

The first step in writing smoke tests is to develop test cases which define the steps required to validate an application’s functionality. Developing test cases is an exercise in identifying functionality that you want to validate, and then creating scenarios to test it. In this tutorial, I’m intentionally describing a very minimal scope for testing. In this situation, my biggest concern is validating that the application is accessible after deployment and that the default page that is served renders the expected static text.

Below is an example of how I developed test cases for this smoke test. I prefer to outline and list the items I want to test because it suits my style of development. The outline shows the factors I considered when developing the smoke tests for this app:

  • What language/test framework?
    • Bash
    • smoke.sh
  • When should this test be executed?
    • After the GKE cluster has been created
  • What will be tested?
    • Test: Is the application accessible after it is deployed?
      • Expected Result: Server responds with code 200
    • Test: Does the default page render the text “Welcome to CI/CD”?
      • Expected Result: TRUE
    • Test: Does the default page render the text “Version Number:”?
      • Expected Result: TRUE
  • Post test actions (must occur regardless of pass or fail)
    • Write test results to standard output
    • Destroy the GKE cluster and related infrastructure
      • Run pulumi destroy
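The checks in the outline above can be sketched in plain Bash before reaching for a framework. This is a minimal illustration, not the code used later in this post: the function names body_contains and run_outline_checks are hypothetical, and the status code and page body are passed in as arguments so the checks can be exercised without a live server.

```shell
#!/bin/bash

# Succeed when the response body contains the given literal text.
body_contains() {
  local body="$1"
  local expected="$2"
  case "$body" in
    *"$expected"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Run the three checks from the outline against an already fetched status
# code and body, print one line per check, and fail if any check failed.
run_outline_checks() {
  local status="$1"
  local body="$2"
  local failures=0
  local text

  if [ "$status" = "200" ]; then
    echo "PASS: server responded with 200"
  else
    echo "FAIL: expected 200, got $status"
    failures=$((failures + 1))
  fi

  for text in "Welcome to CI/CD" "Version Number:"; do
    if body_contains "$body" "$text"; then
      echo "PASS: body contains \"$text\""
    else
      echo "FAIL: body is missing \"$text\""
      failures=$((failures + 1))
    fi
  done

  [ "$failures" -eq 0 ]
}
```

Writing results to standard output and returning a non-zero status on failure covers the first post test action in the outline; destroying the cluster is a separate step.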

My test case outline is complete for this tutorial and clearly shows what I’m interested in testing. An outline like this can also be referred to as a test script. For this post, I will write smoke tests using a Bash-based, open source smoke test framework called smoke.sh by asm89, but you can write smoke tests in whatever language or framework you desire. I picked smoke.sh because it’s an easy framework to implement. Now let’s explore how to express this test script using the smoke.sh framework.

Create smoke test using smoke.sh

The smoke.sh framework’s documentation demonstrates how to use it. The code below is the smoke_test file found in the tests/ directory of the example code’s repo.

#!/bin/bash

. tests/smoke.sh

TIME_OUT=300
TIME_OUT_COUNT=0
PULUMI_STACK="k8s"
PULUMI_CWD="pulumi/gcp/gke/"
SMOKE_IP=$(pulumi stack --stack $PULUMI_STACK --cwd $PULUMI_CWD output app_endpoint_ip)
SMOKE_URL="http://$SMOKE_IP"

while true
do
  # Ask only for the HTTP status code; curl prints 000 if the app is unreachable
  STATUS=$(curl -s -o /dev/null -w '%{http_code}' "$SMOKE_URL")
  if [ "$STATUS" -eq 200 ]; then
    # The app is up: run the smoke.sh assertions and print the report
    smoke_url_ok "$SMOKE_URL"
    smoke_assert_body "Welcome to CI/CD"
    smoke_assert_body "Version Number:"
    smoke_report
    printf '\n\n'
    echo 'Smoke Tests Successfully Completed.'
    echo 'Terminating the Kubernetes Cluster in 300 seconds...'
    sleep 300
    pulumi destroy --stack $PULUMI_STACK --cwd $PULUMI_CWD --yes
    break
  elif [[ $TIME_OUT_COUNT -gt $TIME_OUT ]]; then
    # Give up, tear down the infrastructure, and fail the pipeline
    echo "Process has Timed out! Elapsed Timeout Count.. $TIME_OUT_COUNT"
    pulumi destroy --stack $PULUMI_STACK --cwd $PULUMI_CWD --yes
    exit 1
  else
    echo "Checking Status on host $SMOKE_URL... $TIME_OUT_COUNT seconds elapsed"
    TIME_OUT_COUNT=$((TIME_OUT_COUNT+10))
  fi
  sleep 10
done

Next, I’ll explain what’s going on in this smoke_test file.

smoke_test file breakdown

Let’s start at the top of the file.

#!/bin/bash

. tests/smoke.sh

The snippet above specifies the Bash binary to use and also specifies the file path to the core smoke.sh framework to import/include in the smoke_test script.
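Because the framework is included with a relative path, the script needs to be started from the repository root. One way to fail fast with a clear message when the file cannot be found is sketched below; the function name source_required is hypothetical and is not part of smoke.sh.

```shell
#!/bin/bash

# Source a file only if it exists; otherwise print an error and fail.
source_required() {
  local path="$1"
  if [ ! -f "$path" ]; then
    echo "Cannot find $path; run this script from the repository root." >&2
    return 1
  fi
  . "$path"
}

# Example usage at the top of tests/smoke_test:
#   source_required tests/smoke.sh || exit 1
```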

TIME_OUT=300
TIME_OUT_COUNT=0
PULUMI_STACK="k8s"
PULUMI_CWD="pulumi/gcp/gke/"
SMOKE_IP=$(pulumi stack --stack $PULUMI_STACK --cwd $PULUMI_CWD output app_endpoint_ip)
SMOKE_URL="http://$SMOKE_IP"

The snippet above defines shell variables that are used throughout the smoke_test script. The list below explains their purpose:

  • TIME_OUT=300 - The maximum number of seconds to wait for the application to respond.
  • TIME_OUT_COUNT=0 - The number of seconds waited so far.
  • PULUMI_STACK="k8s" - Used by Pulumi to specify the Pulumi app stack.
  • PULUMI_CWD="pulumi/gcp/gke/" - The path to the Pulumi infrastructure code.
  • SMOKE_IP=$(pulumi stack --stack $PULUMI_STACK --cwd $PULUMI_CWD output app_endpoint_ip) - The Pulumi command used to retrieve the public IP address of the application on the GKE cluster. This variable is used to build the URL that the tests call.
  • SMOKE_URL="http://$SMOKE_IP" - Specifies the URL endpoint of the application on the GKE cluster.

while true
do
  # Ask only for the HTTP status code; curl prints 000 if the app is unreachable
  STATUS=$(curl -s -o /dev/null -w '%{http_code}' "$SMOKE_URL")
  if [ "$STATUS" -eq 200 ]; then
    # The app is up: run the smoke.sh assertions and print the report
    smoke_url_ok "$SMOKE_URL"
    smoke_assert_body "Welcome to CI/CD"
    smoke_assert_body "Version Number:"
    smoke_report
    printf '\n\n'
    echo 'Smoke Tests Successfully Completed.'
    echo 'Terminating the Kubernetes Cluster in 300 seconds...'
    sleep 300
    pulumi destroy --stack $PULUMI_STACK --cwd $PULUMI_CWD --yes
    break
  elif [[ $TIME_OUT_COUNT -gt $TIME_OUT ]]; then
    # Give up, tear down the infrastructure, and fail the pipeline
    echo "Process has Timed out! Elapsed Timeout Count.. $TIME_OUT_COUNT"
    pulumi destroy --stack $PULUMI_STACK --cwd $PULUMI_CWD --yes
    exit 1
  else
    echo "Checking Status on host $SMOKE_URL... $TIME_OUT_COUNT seconds elapsed"
    TIME_OUT_COUNT=$((TIME_OUT_COUNT+10))
  fi
  sleep 10
done

The snippet above is where all the magic happens. It’s a while loop that runs until the application responds or the script times out. Because this pipeline creates a brand new GKE cluster from scratch, the cluster and the application service need time to come up in Google Cloud Platform before smoke testing can begin. On each pass, the loop uses a curl command to request the application and stores the returned status code in the STATUS variable, which is then compared with 200. If the code is not 200 and the timeout has not been exceeded, the loop adds 10 to TIME_OUT_COUNT and sleeps for 10 seconds before repeating the curl request. Once the cluster and app are up, running, and responding, STATUS contains 200 and the remainder of the tests proceed.
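The polling logic can also be pulled into a small reusable function. The sketch below is one possible shape, not the code from the example repo: the function name wait_for_200 is hypothetical, and the command that produces the status code is passed in as arguments so the loop can be tried out with a stub instead of a live cluster.

```shell
#!/bin/bash

# Poll a status command until it prints 200 or the timeout elapses.
# Arguments: timeout in seconds, interval in seconds, then the command
# (and its arguments) that prints an HTTP status code.
wait_for_200() {
  local timeout="$1"
  local interval="$2"
  shift 2
  local elapsed=0
  local status

  while true; do
    # Keep whatever the command printed even if it exits non-zero
    status=$("$@") || true
    if [ "$status" = "200" ]; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after $elapsed seconds (last status: $status)" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example usage with curl (SMOKE_URL as defined in the smoke_test script):
#   wait_for_200 300 10 curl -s -o /dev/null -w '%{http_code}' "$SMOKE_URL"
```

Separating the wait from the assertions keeps the timeout handling in one place, so the tests that follow only run once the application has answered.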

The smoke_assert_body "Welcome to CI/CD" and smoke_assert_body "Version Number:" statements test that the welcome and version number text are rendered on the page being requested. If either assertion fails, the smoke test fails, which fails the pipeline. If both pass, the smoke test passes and the script runs the pulumi destroy command, which terminates all of the infrastructure created for this test case because the cluster is no longer needed.

This loop also has an elif (else if) statement that checks whether the wait has exceeded the TIME_OUT value. The elif statement is an example of error handling, which lets us control what happens when unexpected results occur. If TIME_OUT_COUNT exceeds TIME_OUT, the pulumi destroy command is executed to terminate the newly created infrastructure, and the exit 1 command fails your pipeline build. Both the success path and the timeout path destroy the GKE cluster because there really isn’t a need for this infrastructure to exist outside of testing.
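Cleanup that has to run no matter how a script ends is often registered with an EXIT trap instead of being repeated on each branch, so that it still runs if some step exits the script early. The sketch below shows the pattern in isolation. It is not the approach used in the example repo, and run_with_cleanup, destroy_stack, and run_smoke_checks are hypothetical names used only for illustration.

```shell
#!/bin/bash

# Run a command in a subshell with a cleanup command registered on EXIT,
# so cleanup happens whether the command succeeds, fails, or calls exit.
# The subshell's exit status is the status of the command itself.
run_with_cleanup() {
  local cleanup_cmd="$1"
  shift
  (
    trap "$cleanup_cmd" EXIT
    "$@"
    exit $?
  )
}

# In the smoke_test script, the cleanup would be the existing destroy command:
#   destroy_stack() {
#     pulumi destroy --stack "$PULUMI_STACK" --cwd "$PULUMI_CWD" --yes
#   }
#   run_with_cleanup destroy_stack run_smoke_checks
```

With this shape, the exit status of the checks still decides whether the pipeline passes, while the teardown no longer depends on which branch the script took.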

Adding smoke tests to pipelines

I’ve explained the smoke test example and my process for developing the test case. Now it’s time to integrate it into the CI/CD pipeline configuration above. We’ll add a new run step below the pulumi/update step of the deploy_to_gcp job:

      ...
      - run:
          name: Run Smoke Test against GKE
          command: |
            echo 'Initializing Smoke Tests on the GKE Cluster'
            ./tests/smoke_test
            echo "GKE Cluster Tested & Destroyed"
      ...

The snippet above shows how to execute the smoke_test script from an existing CI/CD pipeline. With this new run block in place, every pipeline build tests the application on a live GKE cluster and validates that it passes all of the test cases. This gives you more confidence that the specific release will perform as expected when deployed to the tested target environment, which in this case is a Google Kubernetes Engine cluster.

Wrapping up

In summary, I’ve discussed and demonstrated the advantages of leveraging smoke tests and infrastructure as code within CI/CD pipelines to test builds in their target deployment environments. Testing an application in its target environment provides valuable insight into how it will behave when it’s deployed to that same target environment. Smoke testing implemented in CI/CD pipelines adds another layer of confidence in application builds.

If you have any questions, comments, or feedback please feel free to ping me on Twitter @punkdata.

Thanks for reading!