Engineering · Jul 27, 2018 · 5 min read

Wide World of Workflows: How we build our Docker convenience images

Community & Partner Engineer, CircleCI

Our workflows blog series has been exploring some of workflows’ major feature sets—job orchestration, multi-executor support, and control flow—by spotlighting open-source examples from CircleCI customers like Artsy, DataDog, Facebook (React Native), Google (Google Cloud), and Mapbox.

To wrap things up, we wanted to turn the spotlight on, well, ourselves, and talk about how we’re using workflows internally. Most of our codebase is not open source, but our circleci-images repository is—and it’s a great example of what you can accomplish with the entire workflows toolset at your disposal.

What’s circleci-images?

We use circleci-images to build and deploy our Docker convenience images, which are used thousands of times every day.

With so many customers depending on these images, we’ve constructed an image-building pipeline that is efficient and reliable, using features like fan-in/fan-out, jobs with different execution environments, cron scheduling, branch-based filters, and multiple workflows.

Our config.yml file also makes extensive use of YAML anchors, and, as part of our deployment process, we incorporate multiple GitHub repositories besides the circleci-images repo itself, and ultimately push images to separate staging and production organizations on Docker Hub.

Let’s dive in and explore how all these disparate parts come together: how does what starts as source code in the circleci-images repo end up as Docker images deployed to 17 different Docker Hub repositories, each with its own specific set of image variants? And, more importantly, why did we build it this way?

The circleci-images journey

Following code on its path through a continuous integration and delivery pipeline typically starts with a single commit. However, with scheduled workflows, things aren’t always so simple. In that vein, our circleci-images config.yml starts by addressing two distinct use cases:
1) We want to build and deploy nightly images.

Why? Because building Docker images typically involves pulling in a lot of upstream dependencies, and our images are certainly no exception. We install a wide range of languages, frameworks, utilities, and applications in our Docker images depending on the image and variant, from common CLI tools like curl, git, and jq; to browser-testing packages like Chromedriver or PhantomJS; to applications like Docker, Firefox, and Chrome. If there are new patches or stable releases of any of this software, we want to update our convenience images as quickly as possible, so we can deliver the safest, most stable platform for our users. Scheduled workflows make this possible.
2) We want to build, test, and deploy images on every commit.

This is the more typical scenario: nightly builds are important, but so is getting immediate feedback on code changes.

To accommodate both these use cases, we use two distinct workflows in our config.yml: a nightly build_test_deploy workflow that runs every day at midnight UTC, but only on circleci-images’ master branch—

workflows:
  version: 2
  build_test_deploy:
    triggers:
      - schedule:
          cron: "0 0 * * *"
          filters:
            branches:
              only:
                - master

—and a regular commit workflow that handles our control flow for each new commit. With only a couple exceptions, however, these workflows run the same set of jobs, which is an important feature of multiple-workflow configurations: define a job once, and you can reuse it in as many workflows as you like, helping to keep your config.yml short and sweet.
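To make the reuse concrete, here is a minimal sketch of two workflows sharing one set of jobs. It is an illustration rather than our exact config: the workflow name `commit` is an assumption, and the job names are borrowed from the snippets later in this post.

```yaml
workflows:
  version: 2

  # Hypothetical name for the per-commit workflow (no triggers key,
  # so it runs on every push).
  commit:
    jobs:
      - refresh_tools_cache
      - publish_node:
          requires:
            - refresh_tools_cache

  # Nightly workflow: the same jobs, plus a schedule trigger.
  build_test_deploy:
    triggers:
      - schedule:
          cron: "0 0 * * *"
          filters:
            branches:
              only:
                - master
    jobs:
      - refresh_tools_cache
      - publish_node:
          requires:
            - refresh_tools_cache
```

Both workflows point at the same job definitions, so a change to a job is picked up by the commit and nightly runs alike.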

Now that we’ve addressed our nightly workflow scenario, let’s move along with following a code change from commit to deploy. On a given commit, we run a couple of initial jobs to programmatically generate Dockerfiles for all the image variants we will be building, so if there are issues with the shell scripts and Make jobs responsible for that work, the workflow stops right there.

If everything goes well, we fan out to our next set of jobs, which do the actual image-building work. These jobs take advantage of our branch-based filtering feature, as we only want to build and push Docker images on commits to a specific set of branches. Again, we use a YAML anchor here to avoid manually repeating the same workflow filters for job after job:

workflow_filters: &workflow_filters
  requires:
    - refresh_tools_cache
  filters:
    branches:
      only:
        - master
        - production
        - parallel
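As a rough sketch of how an anchor like this gets consumed, each job entry in a workflow can simply alias it. The workflow name `commit` and the `publish_golang` job are assumptions made for illustration; the anchor is repeated here only so the fragment parses on its own.

```yaml
workflow_filters: &workflow_filters
  requires:
    - refresh_tools_cache
  filters:
    branches:
      only:
        - master
        - production
        - parallel

workflows:
  version: 2
  commit:                                   # hypothetical workflow name
    jobs:
      - refresh_tools_cache
      # Each alias expands to the requires/filters mapping defined above.
      - publish_node: *workflow_filters
      - publish_golang: *workflow_filters   # hypothetical job
```

Adding another image-building job to the fan-out then costs one line, and a change to the branch list only has to be made in one place.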

Each image-building job looks more-or-less like this:

  publish_image: &publish_image
    machine: true
    working_directory: ~/circleci-bundles
    steps:
      - checkout
      - run:
          name: Docker Login
          command: docker login -u $DOCKER_USER -p $DOCKER_PASS
      - run:
          name: Build and Publish Images
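          command: |
            # Default to the staging Docker Hub organization.
            export COMPUTED_ORG=ccistaging
            # Only the production branch publishes to the production organization.
            if [[ "$CIRCLE_BRANCH" == "production" ]]
            then
              export COMPUTED_ORG=circleci
            fi
            # An explicitly set NEW_ORG overrides the computed organization.
            export NEW_ORG=${NEW_ORG:-$COMPUTED_ORG}

            # Run the publish target for the platform this job was given.
            make -j $PLATFORM/publish_images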
      - store_artifacts:
          path: "."
          destination: circleci-bundles

This YAML anchor lets us reuse our image-building logic, simply passing in a different environment variable to specify which image we’re building in a particular job:

  publish_node:
    <<: *publish_image
    environment:
      - PLATFORM: node

Unlike a traditional fan-out/fan-in workflow, where one might fan out to run tests and fan back in for a discrete deployment job or set of jobs, our deployment logic here is executed within each individual image-building job (as handled via Make targets). Although we host our convenience images on Docker Hub, we build them on CircleCI—we find it to be much faster than using Docker Hub to build every variant of every image.

As you can see with some of our more recently updated Docker Hub repositories—take [circleci/golang](https://hub.docker.com/r/circleci/golang) as an example—we use Docker Hub to build only a single example image per repository, which gives us the integrated Dockerfile and README support provided by Docker Hub’s Automated Builds feature.

To build all these images, one could use either the Remote Docker Environment or the machine executor. If our image-building jobs had some specialized logic that required particular dependencies or frameworks or languages to be installed, or a particular set of commands to be run before building, then Remote Docker might be the best choice here, as it would allow us to use a Docker image as our primary build container and shell out to the Remote Docker Environment for Docker-related commands.

However, since these jobs consist entirely of Docker-related commands, it’s easier to just use the machine executor. This is a great illustration of the power of multi-platform workflows—picking different execution environments for each job offers you a high level of customization that results in a more streamlined, optimized build/test/deploy process.
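To illustrate the trade-off, here is a minimal sketch of the same build step under each option. The job names, the primary container image, and the image tag `example/image` are all assumptions for illustration; `setup_remote_docker` is the step that provisions the Remote Docker Environment.

```yaml
version: 2
jobs:
  # Option 1: Docker executor plus Remote Docker. Useful when the job
  # needs tools from a specific primary container before building.
  build_with_remote_docker:            # hypothetical job name
    docker:
      - image: circleci/golang:1.10    # example primary container with a Docker client
    steps:
      - checkout
      - setup_remote_docker
      - run:
          name: Build image
          command: docker build -t example/image .

  # Option 2: machine executor. Docker is available directly on the VM,
  # so no extra setup step is needed.
  build_with_machine:                  # hypothetical job name
    machine: true
    steps:
      - checkout
      - run:
          name: Build image
          command: docker build -t example/image .
```

When a job is nothing but Docker commands, the second form has one less moving part, which is why it fits our image-building jobs.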

Want to dig deeper? Because all the repositories used as part of our convenience image-building process are open-source, you can see exactly how it all works:

circleci.com/gh/circleci/circleci-images: see the repository building on CircleCI
github.com/circleci/circleci-images: the source code on GitHub
github.com/circleci-public/circleci-dockerfiles: houses the Dockerfiles for each variant of every convenience image we build
github.com/circleci-public/example-images: master repository responsible for triggering all of our Docker Hub Automated Builds
hub.docker.com/r/circleci: our Docker Hub production organization
hub.docker.com/r/ccistaging: our Docker Hub staging organization (please don’t use these images in your projects!)