“Don’t repeat yourself” (DRY) is a well-known principle of software development. Wikipedia defines it as follows: “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.” In this post, I will focus on how this principle relates to your CI suite and explore areas where we can eliminate duplication. Let’s discuss what pitfalls appear in a typical situation and how we can avoid them.

The situation

It’s quite common to build the CI/CD workflow as follows:

  1. Check out the source code from your git repository
  2. Build an artifact (e.g., a Docker image)
  3. Run tests
  4. Push the artifact to a registry
  5. Deploy the artifact to the respective environment

It looks quite concise, but most of those steps require some pre-configuration and come with their own dependencies.
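To make that pipeline tangible, here is a rough sketch of how it might be declared in a CircleCI 2.0 config. The image, tags, and the deploy.sh script are illustrative placeholders, not a prescription:

version: 2
jobs:
  build:
    docker:
      - image: docker:17.09.0-ce-git     # illustrative; any image with the needed tools
    steps:
      - checkout                         # 1. check out the source code
      - setup_remote_docker              # gives the job access to a Docker engine
      - run: docker build -t your-registry/app:$CIRCLE_SHA1 .          # 2. build an artifact
      - run: docker run --rm your-registry/app:$CIRCLE_SHA1 npm test   # 3. run tests
      - run: docker push your-registry/app:$CIRCLE_SHA1                # 4. push to a registry
      - run: ./deploy.sh                 # 5. deploy to the respective environment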

Here are some examples of considerations that can make this setup more complicated:

  • In order to build an artifact, we need a specific version of Docker and a specific version of npm. “Specific” is key here because we will not be satisfied with an arbitrary version of a system (or any other) dependency. A version that is too old might not support the exact functionality we need. A version that is too new might not be backward-compatible and might break the build.
  • In order to run tests, we need a specific version of docker-compose and a utility to upload the test results to a third-party service for future analysis.
  • Chances are that the Docker images are stored in a private registry (like ECR on AWS or GCR on GCP) so we need the respective Command Line Interface (CLI) (the correct version) to push them.
  • In order to carry out the deployment itself, we need to call the CLI of the target platform (e.g. kubectl or awscli). This requires another CLI (and the correct version).
  • Finally, we might need custom tools that let us parse and process the cluster/platform metadata relevant to the deployment. For example, “deployment” jobs of the workflow might invoke some logic that defines the deployment target based on the name of the git branch and performs many useful pre-sets (e.g., a verbose name of the build, the name of the cloud credentials to use, etc.).

So, our CI/CD workflow also needs to cover pre-configuring the CI environment and making our custom utility scripts available inside it.
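For instance, the branch-based target selection mentioned above could be a small shell helper along these lines (a sketch only; the target names and DEPLOY_ENV variable are made up, while CIRCLE_BRANCH is provided by CircleCI):

#!/usr/bin/env bash
# Hypothetical helper, meant to be sourced by a deploy step.
case "$CIRCLE_BRANCH" in
  master)  export DEPLOY_ENV=production ;;
  develop) export DEPLOY_ENV=staging ;;
  *)       export DEPLOY_ENV=review ;;
esac

echo "Deploy target: $DEPLOY_ENV"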

The problem

Let’s assume our application is a single-page web app with, say, Python/Django on the back end and AngularJS on the front end. Besides having entirely different technology stacks, chances are good that the two components are developed by two different teams. In general, they don’t share much beyond a RESTful interface and the fact that they are two sides of the same product.

It’s quite natural that the two components live in different git repositories. This means we need two separate CircleCI configuration files and two sets of our utility scripts. Unlike the application code, these files tend to be identical.

As the system evolves, new applications will be added to the product, especially if you decide to go the microservices route, and the CI suite would have to be copied for every new service. That amount of duplication is problematic for many reasons, so it benefits us to reduce it as much as possible.

How CircleCI can help

A prominent feature of CircleCI 2.0 is that it lets us run CI/CD jobs with our own Docker images. Let’s go ahead and pack our CI suite into an image that will be reused by all of our applications. It lets us build a single artifact once and reuse it across as many applications as necessary, simply by referring to it in .circleci/config.yml.

Let’s do it step-by-step

For the examples below to be more tangible, we will make concrete assumptions regarding the technologies and tools. However, the approach itself is applicable to virtually anything that can be run with Docker and CircleCI.

Step 1

First, let’s pack all of the third-party software we need for our CI workflow into a Docker image. For example, we might need npm to build our Angular app, awscli to run cluster operations, and a utility called shyaml to facilitate processing of the docker-compose config. A Dockerfile like the following lets us package it all together.

FROM python:3.6

RUN curl -sL https://deb.nodesource.com/setup_9.x | bash - && \
    apt-get install -y nodejs=9.3.0-1nodesource1 && \
    apt-get clean && \
    npm install -g npm@5.6.0 && \
    npm install -g webpack@3.1.0

RUN pip install awscli==1.15.16 shyaml==0.3.4
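Building and tagging the image could be as simple as the following (the registry name is a placeholder that matches the tag we will reference below):

docker build -t your-docker-registry/ci:0.1.0 .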

Step 2

Now we can push the image to Docker Hub and refer to it in the CircleCI configs (.circleci/config.yml) simply as:

docker:
- image: your-docker-registry/ci:0.1.0

Excellent! Now we don’t have to waste time installing all the third-party dependencies. Every time someone triggers a build on CircleCI, it will start with everything in place.
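In the context of a full job, that might look roughly like this (the build commands are illustrative; the point is that npm, webpack, awscli, and shyaml are already baked into the image):

version: 2
jobs:
  build:
    docker:
      - image: your-docker-registry/ci:0.1.0   # our pre-baked CI image
    steps:
      - checkout
      - setup_remote_docker    # needed if the job builds Docker images
      - run: npm install       # npm comes from the image -- no install step
      - run: webpack           # so does webpack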

SECURITY NOTE: DO NOT include anything sensitive (such as API credentials or SSH keys) in your Docker image. A much safer (and still easy) approach to handling secrets is to feed them in via environment variables on each CI job.
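For example, a step that pushes an image to ECR can rely entirely on environment variables configured in the CircleCI project settings ($ECR_REPO here is a made-up variable name):

- run:
    name: Push Docker image
    command: |
      # AWS credentials and $ECR_REPO come from CircleCI project
      # environment variables, never from the Docker image itself.
      eval $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
      docker push $ECR_REPO:$CIRCLE_SHA1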

Step 3

What’s left now are the CI “functions”, i.e., the code we have in the run blocks of the CircleCI config file.

For example, the following config file builds a Docker image with docker-compose in a way that injects metadata related to the build. This information can be useful at runtime for referencing the state of the code in application logs or crash reports.

    - run:
        name: Build Docker image
        command: |
          # Locate the Docker build context of the service in the compose file
          DOCKER_CONTEXT=$(docker-compose config | \
            shyaml get-value services.$SERVICE_NAME.build.context)

          # Inject build metadata so the application can reference the state
          # of the code at runtime (logs, crash reports, etc.)
          echo "{\"build_num\": \"$CIRCLE_BUILD_NUM\", \"release\": \"$CIRCLE_TAG\"}" \
            > ${DOCKER_CONTEXT}/build_meta.json

          docker-compose build $SERVICE_NAME

This is not something you would want to copy-paste from repository to repository. How do you reuse this across repositories as much as possible?

Let’s move code like this into a separate file (e.g., in a folder called ./ci-scripts) and include it in our CI Docker image by adding this extra line to the Dockerfile:

COPY ./ci-scripts /opt/ci-scripts
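If the scripts are not already marked executable in the repository, one more Dockerfile line takes care of that:

RUN chmod +x /opt/ci-scripts/*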

Then you can rewrite all the CircleCI configs like this:

- run:
    name: Build Docker image
    command: /opt/ci-scripts/build-docker-image.sh
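The script itself (build-docker-image.sh is just a name chosen for this example) can simply wrap the logic we previously had inline:

#!/usr/bin/env bash
# /opt/ci-scripts/build-docker-image.sh -- same logic as the inline version.
set -eo pipefail

# Locate the Docker build context of the service in the compose file
DOCKER_CONTEXT=$(docker-compose config | \
  shyaml get-value services.$SERVICE_NAME.build.context)

# Inject build metadata for use at runtime
echo "{\"build_num\": \"$CIRCLE_BUILD_NUM\", \"release\": \"$CIRCLE_TAG\"}" \
  > "${DOCKER_CONTEXT}/build_meta.json"

docker-compose build "$SERVICE_NAME"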

Step 4

Remember, the goal was to reuse the “CI” code. The last step is to move ci-scripts as well as the “CI” Dockerfile to a separate repository (your “ops” repository, if you have one, or create a new one).
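A possible layout for that repository (the script names beyond build-docker-image.sh are just suggestions):

Dockerfile
ci-scripts/
  build-docker-image.sh
  run-tests.sh
  deploy.sh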

Now, if somebody on the team has questions or ideas regarding the CI internals, they know that there is a single place where everything is located. Additionally, our CI-related code is not spread out across multiple repositories any longer.

Summary

The described technique provides you with the following benefits:

  • You have no “copies” of identical CI scripts/configs. In a nutshell, your CI suite adheres to the DRY principle.
  • You have a smoother way to maintain the CI tooling because of unification and consolidation.
  • You have a clear CI workflow with little to no discrepancy between applications, making it easier for everyone on the team to understand.
  • You have no need to install third-party dependencies (like those CLIs) on every build, and your team will appreciate the shorter build times.

About Gennady: Gennady has been writing code, deploying code, and coaching those who write and deploy code for over a decade. He’s fascinated by elegant solutions to non-trivial problems, the intelligent design of sophisticated things, and the beauty of pragmatic simplicity.