Make no mistake, speed is important for DevOps. But speed means little if it’s not accompanied by reliable and consistent quality. High velocity is great, but high confidence is just as vital. Taken together, the two qualities create what’s known as high-performance DevOps.
The challenge is knowing how to gauge your DevOps team’s performance, since speed only tells part of the story. What does a high-performing team actually look like, and how do you know if your team is doing well, compared to other DevOps teams? What does “fast” look like?
In our recent report, The Data-Driven Case for CI: What 30 Million Workflows Reveal About DevOps in Practice, we dove into the data and metrics that reveal DevOps performance. In this blog post and three to follow, we’ll zoom in on four of the key metrics considered to be industry standards for describing DevOps performance.
We also share our run data that supports the key metrics already familiar to DevOps professionals who follow sources like the 2019 State of DevOps Report. Thanks to the information we’ve gathered from 30 million CircleCI workflows, we can say with confidence that the data supports these metrics as benchmarks for success – and that continuous integration (CI), which lets teams automate the time-consuming manual steps involved in software delivery, makes the difference.
It won’t surprise you that we’re big fans of CI, and cheerleaders of DevOps teams that adopt the CI approach. According to Puppet’s 2016 State of DevOps Report, high-performing DevOps teams outperform other teams with 200 times more frequent deployments, 24 times faster recovery from failure, and 2,555 times shorter lead times.
CI’s impact on lead time
In this post, we’ll address lead time: the length of time it takes for a workflow to run, from first trigger to completion. By “lead time,” we don’t necessarily mean “deploy to production”; we mean getting to the desired end-state of whatever your workflow is meant to accomplish. In other words, the lead time we measured is simply how long a given workflow takes to run.
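To make the definition concrete, here is a minimal sketch of how a lead time could be computed from two timestamps. The function name and the ISO 8601 timestamp format are our assumptions for illustration; they are not part of any CircleCI API.

```python
from datetime import datetime


def lead_time_seconds(triggered_at: str, completed_at: str) -> float:
    """Return the seconds elapsed between a workflow's first trigger and its completion.

    Both arguments are ISO 8601 strings with a UTC offset (an assumed format).
    """
    start = datetime.fromisoformat(triggered_at)
    end = datetime.fromisoformat(completed_at)
    if end < start:
        raise ValueError("completion time precedes trigger time")
    return (end - start).total_seconds()


# Hypothetical workflow: triggered at noon, finished 3 minutes 27 seconds later.
example = lead_time_seconds(
    "2019-06-01T12:00:00+00:00",
    "2019-06-01T12:03:27+00:00",
)
print(f"lead time: {example:.0f} seconds")
```

Note that nothing here depends on whether the workflow deploys to production; the measurement only needs a start and an end.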
Lead time is about how quickly you can get a signal. If you want a short lead time, you need to maximize automation as much as possible. This is where CI comes in handy: If you automate as much of the pipeline as possible, you can shrink deployment timelines from the traditional weeks or months to hours – even minutes.
To understand how observed development behavior compares with industry standards, we looked at CircleCI data from over 30 million workflows run between June 1 and August 30, 2019. The workflows represent:
- 1.6 million job runs per day
- More than 40,000 orgs
- Over 150,000 projects
Here’s what we found:
- The minimum recorded lead time was 2.1 seconds
- The maximum lead time was 3.3 days
- 80 percent of the 30 million workflows finished in under 10 minutes
- The 50th percentile of lead times was 3 minutes and 27 seconds
- The 95th percentile was 28 minutes
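If you want to produce the same kind of summary for your own workflows, a sketch along these lines would work. The sample durations are made up for illustration, and the nearest-rank percentile method is our choice here, not necessarily the method used for the report.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(lead_times_seconds, threshold_seconds=600):
    """Summarize lead times: min, max, median, 95th percentile,
    and the share of workflows finishing under the threshold (default 10 minutes)."""
    ordered = sorted(lead_times_seconds)
    under = sum(1 for t in ordered if t < threshold_seconds)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "share_under_threshold": under / len(ordered),
    }


# Hypothetical lead times in seconds (not taken from the report's data set).
sample = [2.1, 45, 120, 207, 300, 480, 590, 900, 1680, 285120]
for name, value in summarize(sample).items():
    print(f"{name}: {value}")
```

With only a handful of workflows, the 95th percentile collapses onto the slowest run, so a summary like this becomes more meaningful as your sample grows.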
What’s the takeaway from our data?
There’s no one-size-fits-all answer: It depends on your workflows. Workflows that finish in 2.1 seconds don’t do much: odds are they’re sending notifications, or printing a message and returning exit 0. The maximum lead time of 3.3 days that we noticed likely reflects a lot of testing, like regression suites, integration suites, and perhaps cross-compiling for multiple platforms.
As we said, lead time is dependent on what you’re trying to accomplish. For some teams and some workflows, 28 minutes (the 95th percentile) would be too long to wait for a signal; in other cases, completing a workflow in 28 minutes might be a big success. The complexity of your tests, the type of software you’re building, and how deeply tests are integrated all factor into lead time.
If you can add in any kind of optimization that reduces workflow time, you’re going in the right direction.
Is there such a thing as a workflow that runs too long? Or is too slow?
As we noted right off the bat, speed is important when it’s matched with quality. But we don’t mean to say that DevOps organizations with workflows that run three-plus days are slow. The org could be one with hundreds of projects that are dependent on each other – something that’s not reflected in our data.
But while lead time can vary for many reasons, the lead times we observed represent a massive reduction when compared to teams that are not using CI. This is why teams use continuous integration to automate building and testing, and it’s how teams using CI are able to reduce lead time by four orders of magnitude over their peers.
The incremental value of CI adoption
We don’t believe in a universal standard for workflow lead time. Teams shouldn’t worry about hitting universal benchmarks – they should focus on looking at internal opportunities to reduce workflow length.
Adopting CI doesn’t happen overnight, but the good news is that even a little bit helps (and you don’t have to do it perfectly). It’s a great place to start: simply adopting CI principles puts you on the road to improved performance.
In the next three blog posts on our workflow data, we’ll share insights on more key DevOps metrics: deployment frequency, mean time to recovery, and change fail percentage.