Why are Deterministic Builds Important?

Reproducibility and reliability.

The most common thing a customer will say in a support ticket is that their builds are suddenly failing even though “nothing has changed” on their end. This is almost never true.

In this post, I want to talk about deterministic builds. The idea is to reduce the number of changing parts within a build. That means fewer mysterious failing builds, fewer support tickets (for you and for us), and perhaps even the ability to identically reproduce an accidentally deleted binary simply by re-running the build.
We all know testing is important. However, the more variables at play within a build, the less confident you can be in your tests. With a moving target, tests that pass in one scenario may no longer pass 3 commits later, even with the “same” dependencies.

By ensuring your builds are truly deterministic, you reduce the number of changing parts in your build, which increases confidence in your test suite and improves reproducibility.

In the context of continuous integration (CI), there are many ways a developer or organization might try to optimize their builds. There are some well-known optimizations: decreasing build times or reducing flaky tests. However, these tactics miss the bigger picture. Optimizations like these focus on a single build, taking into consideration just one part of the landscape. It’s also important to consider past and future builds, and their relationship to each other.

Think of your builds as a continuum: a series of interconnected parts over time.

By taking proactive measures now, you prevent problems in the future and preserve the integrity of previous builds.

What is a Deterministic Build?

A deterministic build is one that can be run “live” at commit time, tomorrow, or even next month, and end with the exact same results. You can imagine taking a “fingerprint” of a build when it first ran, and doing so again on a re-run. The two should match exactly.

What do we mean by the same results or fingerprint? When a build is re-run in the future, the tests that failed the first time should fail again, and the tests that passed should pass again. Does the build produce artifacts? Those artifacts should, in theory, be exactly the same as well. This usually comes easily with log files or screenshot artifacts, but the real prize is binaries. Being able to re-run a build that produced the version X.Y binary of your software, 3 months later, and get back that exact same binary is the height of what you can accomplish with a deterministic build.
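To make the “fingerprint” idea concrete, here is a minimal sketch using shell commands: checksum the artifacts a build produces and compare those checksums across runs. The file names and paths below are hypothetical.

# Record a "fingerprint" of the artifacts this run produced
sha256sum dist/myapp-1.2.0.tar.gz > fingerprint-rerun.txt

# If the build is deterministic, the re-run's checksums match the
# checksums recorded when the build first ran
diff fingerprint-original.txt fingerprint-rerun.txt && echo "builds match"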

How Do I Make My Builds More Reproducible?

Version Pinning

This might be the single easiest change you can make. Declare the most specific version of your dependencies that you can. For example, with npm it’s better to run npm install react@16.0.0 instead of npm install react@16.0 or npm install react@">=16".
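As a small sketch of what this can look like in practice (assuming a reasonably recent npm that supports package-lock.json and npm ci):

# Pin the exact version and record it in package.json
npm install --save-exact react@16.0.0

# Commit package-lock.json, then have CI install from it verbatim
# so every build resolves the exact same dependency tree
npm ci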

The same goes for Docker images. It’s better to do:

version: 2
jobs:
  build:
    docker:
      - image: golang:1.9.3
    steps:
...

than it would be to do:

version: 2
jobs:
  build:
    docker:
      - image: golang:1.9
    steps:
...

In both scenarios, leaving off the bugfix/patch number of the version (the third integer) allows the version of the dependency you’re using to change out from under you. Today, - image: golang:1.9 might pull version 1.9.3 of the Docker image, but a build next week might pull version 1.9.5.

Of course, there’s always a catch. Pinning the bugfix release number means that if there’s, well, a bugfix (or security patch), your software won’t automatically start using it. You have to decide what level of maintenance you’re comfortable with. 1.9.3 is the most deterministic and least “cutting edge”, while 1 is the least deterministic and most “cutting edge”.

Caching

Yes, caching. Most people would consider caching a technique to decrease build time, which it is, but it can also be used to increase the reliability of a build.

How? Sometimes caching is used to store a compiled binary or data that was generated in a CPU-intensive way. In that context, we save time but don’t necessarily gain reliability. Caching increases reliability when a build retrieves files over the network. How? What if the network temporarily goes down? What if DNS fails? Once those assets are cached, we don’t need to worry about those issues failing an otherwise good build. If npm’s registry goes down while we have all of our dependencies cached, theoretically we’re good. The build will continue without a problem.
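As a rough sketch, dependency caching in a CircleCI 2.0 config for an npm project could look like the following; the image tag, cache key, and paths are only illustrative.

version: 2
jobs:
  build:
    docker:
      - image: node:8.9.4
    steps:
      - checkout
      # Restore node_modules from a previous build when the lockfile hasn't changed
      - restore_cache:
          keys:
            - v1-deps-{{ checksum "package-lock.json" }}
      - run: npm install
      # Save the dependencies so later builds don't depend on the network
      - save_cache:
          key: v1-deps-{{ checksum "package-lock.json" }}
          paths:
            - node_modules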

Caching also helps with reproducibility. What if an asset downloaded from somewhere on the Internet is no longer available 2 months from now because its owner didn’t pay their hosting bill? What if a dependency release accidentally reused a version tag (Canonical did this recently with Ubuntu 17.10)? Or worse, what if this happened maliciously? Caching ensures that the same version, and more specifically the same code, that a build ran with the first time is what it will run with again.

Summary

A build may never be 100% fully deterministic. Because of security patches, or one company policy or another, a build may have to include some “moving parts”. The real point behind this post is making sure your build is as deterministic as it can be for your use case. That will ensure your builds are optimized for reliability and reproducibility.