This guide provides tips on how to create an effective Docker image to be used as the primary image in the Docker executor. All steps within the job, including orb commands, will execute in this image. This guide does not, however, cover how to build and publish a Docker image that will then be used in your production environment or to distribute your open source application.
For more Docker content, visit Guide to using Docker for your CI/CD pipelines.
I’d like you to ask yourself, are you sure you want to do this? There are many people who don’t want to install a few packages at run-time because it will increase their build time by 20-45 seconds. They then invest several hours figuring out how to create their own Docker image, and then more time over months and years to maintain that image. Sometimes it’s not worth the effort.
The sweet spot for your own custom Docker image is when lots of packages need to be installed, source code needs to be compiled, or downloads occur over slow connections. This is where a custom image will shine.
Still interested? Great, let’s get started.
Writing the Dockerfile
Everything begins and ends with the Dockerfile. It is the blueprint from which a Docker image is created. Visit Docker’s Getting Started - Part 2 guide for more information about the Dockerfile.
The base image
A Dockerfile typically starts with the FROM instruction, which declares the base image your new image will use. While in no way required, we strongly recommend using the CircleCI base convenience image as your base image.
FROM cimg/base:stable
The CircleCI base image serves as the base for all of the next-gen convenience images. You can learn more about all of the new images here. It’s designed from the ground up to work well on CircleCI. It has all the required tools, as well as the most common tools, that the majority of our customers need. For example, the checkout special step requires git to be installed in the Docker image. The save_cache and persist_to_workspace special steps, as well as their loading equivalents, require tar and gzip to be installed. Our base image includes all of these tools and we ensure that the list of software is maintained as additional requirements arise or are removed.
This image is also fairly popular on our platform which means there are cache benefits to be obtained by basing your image on our base image. You can learn more about the CircleCI base convenience image and how to use it on its GitHub page. If you choose to use a different base image, I’d suggest at least visiting the GitHub page to see what packages we install so you can copy that list for yourself.
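If you do go with a different base image, the starting point might look something like the sketch below. It is only an illustration: the ubuntu:20.04 base is an arbitrary choice, and the ca-certificates and openssh-client packages are my assumption about what a CI image commonly needs (only git, tar, and gzip are named above). Check the package list on the GitHub page rather than treating this as complete.

```dockerfile
# Hypothetical alternative base image; any Debian/Ubuntu image would look similar.
FROM ubuntu:20.04

# This image runs as root, so sudo is not needed here.
# git is needed by the checkout step; tar and gzip by the cache and workspace steps.
# ca-certificates and openssh-client are assumptions, not taken from this guide.
RUN apt-get update && apt-get install -y \
        ca-certificates \
        git \
        gzip \
        openssh-client \
        tar && \
    rm -rf /var/lib/apt/lists/*
```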
Installing and downloading software
RUN sudo apt-get update && sudo apt-get install -y \
        bison \
        llvm \
        xz-utils \
        zlib1g-dev && \
    sudo rm -rf /var/lib/apt/lists/*
In this example there are several best practices we can learn from:
- When using an image that doesn’t run as the root user, you need to prefix some commands with sudo.
- In scripting scenarios, always use apt-get over apt. The former is better for environments where humans aren’t around while the latter is better when tinkering away on your own local computer.
- When installing with a package manager, flags like -y are used to assume “yes” when the package manager might have tried to ask a question.
- List a single package per line, alphabetically. This works wonders with git and GitHub. It makes reading the source much more enjoyable and, when viewing PR diffs, it’s very clear which packages were added or removed.
- The last line deletes the Apt cache in the image. This helps with caching, but more importantly reduces the size of the image.
ENV GO_VERSION=1.14.1
RUN curl -sSL "https://golang.org/dl/go${GO_VERSION}.linux-amd64.tar.gz" | \
sudo tar -xz -C /usr/local/
This example gives us two more tips:
- Setting frequently changing strings as an environment variable makes changes visually much easier to manage, especially since the ENV instruction only sets image metadata and doesn’t add to the size of the image. This technique pays off even more when the value is used multiple times in upcoming RUN instructions.
- Here cURL downloads a tarball and pipes it directly into the next command, tar. This avoids saving the file to the filesystem, which is faster and leaves no tarball to clean up afterward. In situations where you can’t use this technique, don’t forget to clean up after yourself by deleting the tarball, zip package, etc. with rm.
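For the cases where piping isn’t possible, here’s a rough sketch of the clean-up approach. A zip archive is a typical example, since unzip needs a real file to read from. The SOME_THING_VERSION variable, the example.com URL, and the install directory are all made up for illustration, and I’m assuming unzip is available in your base image.

```dockerfile
# Hypothetical tool and URL, for illustration only.
ENV SOME_THING_VERSION=4.3.1

# Download, extract, and delete the archive in a single RUN step
# so the zip file never ends up stored in an image layer.
RUN curl -sSL -o /tmp/some-thing.zip \
        "https://example.com/download/some-thing-${SOME_THING_VERSION}.zip" && \
    sudo unzip -q /tmp/some-thing.zip -d /usr/local/bin && \
    rm /tmp/some-thing.zip
```

The important part is that the rm happens in the same RUN instruction as the download. Deleting the file in a later RUN step would not shrink the layer that already contains it.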
Caching and efficiency
The order of instructions is important for larger images. You want RUN steps that change more frequently towards the bottom of the Dockerfile, while steps that change less often should be ordered towards the top. This is because Docker caches images by layers. Whenever a layer changes, that layer and every layer after it in the Dockerfile need to be rebuilt. This behavior is explained in much more detail in Docker’s Dockerfile best practices guide.
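As a rough sketch of that ordering, here are the earlier snippets from this guide arranged by how often they’re likely to change. Which step changes more often is an assumption on my part; your own image may well be the other way around.

```dockerfile
FROM cimg/base:stable

# Changes rarely: system packages go near the top so this layer stays cached.
RUN sudo apt-get update && sudo apt-get install -y \
        bison \
        llvm && \
    sudo rm -rf /var/lib/apt/lists/*

# Changes often: version bumps go near the bottom so that only the
# layers from here down are rebuilt when GO_VERSION is updated.
ENV GO_VERSION=1.14.1
RUN curl -sSL "https://golang.org/dl/go${GO_VERSION}.linux-amd64.tar.gz" | \
    sudo tar -xz -C /usr/local/
```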
Another efficiency item is that image layers should be as lean as possible. Files that are not needed should be deleted within the same RUN step that created them, which keeps the overall size of that layer smaller.
Maintaining the image
Creating a custom image is only half the job. Once that image exists, maintenance needs to occur. As your CI requirements change and grow, your image will need to adapt along with it.
A home for your Dockerfile
Keep your Dockerfile under version control. Some people prefer to keep this Dockerfile in the same repository as the project they need it for. Others, like myself, prefer to put it in its own repository.
I prefer the separate repo route because:
- When it’s in the same repo as the project, the CircleCI config for that project gets more complex. You don’t want to build the Docker image on every commit to the main project. That’s an expensive waste of time. To avoid this, you’ll need logic to separate when the project and the image get built.
- A separate repo makes it much easier to use the image for more than one project at your company or on your team.
Keeping the Dockerfile up-to-date
When adding or removing something from a Dockerfile, create a new branch with your changes, review, and merge. Standard stuff. There are parts of a Docker image that get updated outside of your Dockerfile. How do we keep those parts updated?
Your base image is its own project and gets updated on its own schedule. For example, the CircleCI base convenience image gets stable updates once a month. If you download and install software with a file name such as example.com/download/some-thing-4.3.x.tar.gz, you’ll want to make sure you’re updating your image with the latest patch version available for that software. We do this with CircleCI scheduled workflows.
Publishing your custom Docker image using a scheduled workflow allows you to keep the background pieces of your image regularly updated without any manual work from you. The frequency for this depends on your personal needs. The CircleCI base image updates on the 2nd of every month. If you’re using that as a base, your own monthly scheduled workflow on the 3rd or 5th would work well.
Hosting the image
You have options on where to host your published Docker image. The place where you host a Docker image is called a Docker registry. The de facto place to host a Docker image is Docker Hub. If you’re unfamiliar with any other registry, stick to Docker Hub and you’ll be fine. It’s free to use, even for a private image.
If you’re using a leading cloud provider for hosting, they likely have a Docker registry you can use and we likely have a CircleCI orb for that provider, making setup for all of this easier. Here are the major Docker registries and the corresponding orbs:
- DockerHub - by Docker: website / orb
- GCR - by Google Cloud: website / orb
- ECR - by AWS: website / orb
- ACR - by Azure: website / orb
- Artifactory - by JFrog: website / orb
Discussion and feedback
To discuss this topic some more or to ask questions, please visit our CircleCI Discuss forum. You can find me and CircleCI users just like you who are ready to discuss and help.