AI Development · Oct 22, 2025 · 9 min read

DORA is right: AI is an amplifier, for better or worse

Ron Powell

Director, Web and Content Strategy

The 2025 DORA report just surveyed nearly 5,000 technology professionals and delivered a verdict that should reshape how you think about AI investment: AI doesn’t create organizational excellence; it amplifies what already exists.

For teams with solid foundations, AI is a force multiplier. For teams with broken processes and dysfunctional systems, AI magnifies the chaos. The report’s conclusion is unambiguous:

The greatest returns on AI investment come not from the tools themselves, but from a strategic focus on the quality of internal platforms, the clarity of workflows, and the alignment of teams.

With 90% of organizations now using AI in software development, a 14% surge from last year, the question isn’t whether to adopt AI. That decision has been made. The question is whether your delivery system can handle what AI amplifies.

I’m Ron Powell, and I’ve spent the last seven years at CircleCI analyzing how software teams deliver value. As the primary author of all six editions of CircleCI’s State of Software Delivery report, I’ve watched thousands of organizations navigate technological shifts. But nothing compares to what AI is doing to software delivery right now.

The promise that became a problem

Six months ago, your engineering teams started adopting AI coding assistants. The early metrics were intoxicating. Developers reported 30%, 50%, sometimes 100% faster code generation. A junior engineer could produce in an afternoon what used to take a senior engineer days. Productivity dashboards lit up green. The board was impressed. You were ahead of the curve.

But then something strange happened. Despite all that accelerated code generation, your deployment frequency stayed flat. In some cases, it actually declined. Features took just as long, sometimes longer, to reach production. Customer complaints about bugs increased. Your recovery times stretched from hours to days.

The DORA report confirms what you’ve been experiencing: AI boosts throughput, but often at the cost of stability. Teams are generating unprecedented volumes of code, but that code is crashing against validation systems never designed for machine-scale development.

Welcome to the AI delivery bottleneck, where the promise of AI acceleration meets the reality of human-speed validation.

The hidden crisis in your CI/CD pipeline

Having analyzed tens of millions of workflows for this year’s State of Software Delivery report, I can tell you exactly why AI breaks traditional delivery systems. Let me walk you through what’s actually happening in your pipelines, using four critical metrics as a lens.

The throughput paradox

Elite teams achieve extraordinary throughput. Our data reveals that top performers run nearly 4,000 workflows per day on average, and leaders exceed 14,000. These aren’t typos. These are teams that have cracked the code on continuous delivery.

Meanwhile, your throughput remains stuck at dozens, maybe hundreds of workflows per day. Not because your developers aren’t productive. They’re generating more code than ever. But that code sits in queues as review processes and validation pipelines struggle to keep up with the pace and complexity of AI-generated changes.

The duration explosion

Before AI, your pipelines ran in 10 minutes. Clean, predictable, optimized. Now? Those same pipelines take 30 minutes, maybe 45, as AI-generated changes trigger cascading test suites. Our data shows that while high performers maintain median durations of 2 minutes 43 seconds, the average has climbed to 11 minutes, with many teams exceeding 25 minutes at the 95th percentile.

Every additional minute compounds. A 20-minute pipeline means developers context-switch. A 45-minute pipeline means they’ve mentally moved on. An hour-long pipeline? They’re working on something else entirely, and the cognitive cost of returning to fix failures skyrockets.

The success rate illusion

Your success rate might look healthy—our data shows an average of 82% on main branches. Your quality gates are working, catching issues before production. That’s good, right?

Yes and no. The success rate tells you that your CI/CD system is functioning as designed. It’s catching bugs, flagging security issues, identifying performance problems. But each failure now carries a hidden cost that success rate metrics don’t capture.

The MTTR nightmare

This is where the real damage appears. Mean Time to Recovery (MTTR), how long it takes to fix a failed build, has become the silent killer of AI-powered development.

When human-written code fails, developers usually know why. They wrote it. They understand the context, the patterns, the likely failure points. Recovery is quick: our data shows a median MTTR of 63 minutes.

But AI-generated code? That’s different. When it fails, developers face a mystery. They didn’t write the code. They don’t understand the patterns. They lack context for the implementation choices. What should be a five-minute fix becomes a two-hour archaeological expedition through generated logic they’ve never seen before.

Our data exposes the core issue: while median MTTR sits at just over an hour, the average has exploded to 24 hours—23 times the median. This rightward skew tells a story of teams drowning in AI-driven complexity.
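To see how a long tail produces that gap, consider a purely illustrative sample of recovery times (the numbers below are hypothetical, not CircleCI data): most failures get fixed in about an hour, but a handful of multi-day investigations drag the mean far above the median.

```python
from statistics import mean, median

# Hypothetical recovery times (in minutes) for 20 failed builds.
# Most are resolved within about an hour, but a few AI-generated failures
# turn into multi-day archaeological expeditions.
recovery_minutes = [40, 45, 50, 55, 58, 60, 61, 62, 62, 63,
                    63, 65, 70, 80, 120, 240, 480, 2880, 7200, 14400]

print(f"median: {median(recovery_minutes):.0f} min")      # 63 min: the typical fix is still quick
print(f"mean:   {mean(recovery_minutes) / 60:.1f} hours")  # ~22 hours: a few disasters dominate the average
```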

The human cost of machine-speed development

Behind every workflow and recovery metric are engineers struggling to adjust to the new pace of work. Their ability to adapt may be the single biggest factor determining your business success over the coming years.

Your senior engineers, the architects of your systems, the mentors of your teams, are drowning in code reviews. A junior developer using AI generates a 2,000-line pull request in five minutes. The senior engineer reviewing it needs three hours to understand what was generated, verify it meets architectural standards, and ensure it won’t create technical debt.

The math is brutal. If each of your 50 senior engineers spends four hours daily reviewing AI-generated code, you’re burning 200 senior engineering hours every single day on validation. That’s 50,000 hours annually, the equivalent of 25 full-time senior engineers, consumed by review overhead. Your most valuable engineers aren’t building the future; they’re trying to understand what machines built while they were sleeping.

Trust has become the scarcest resource. The DORA report found that 30% of developers have little to no trust in AI-generated code. They’re using it because they have to, but they don’t believe in what it produces. This creates a toxic dynamic: developers generate code they don’t trust, reviewers validate code they don’t understand, and everyone hopes the tests catch whatever they missed.

The three shifts to autonomous validation

After years of tracking how elite teams evolve their practices, I’ve identified three fundamental shifts that separate organizations thriving with AI from those drowning in it.

Shift 1: From sequential to parallel intelligence

Stop thinking of validation as something that happens after code generation. Validation should run continuously, in parallel, at machine speed. By the time a developer submits a pull request, most validation should already be complete. The system already knows if it works, if it’s secure, if it matches your patterns.
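Here’s a minimal sketch of what that can look like, assuming the check commands below (ruff, mypy, pytest, pip-audit) are placeholders for whatever your pipeline actually runs: every independent check kicks off at once the moment a change lands, rather than being chained one after another behind review.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Placeholder validation commands; substitute the checks your pipeline actually runs.
CHECKS = {
    "lint":     ["ruff", "check", "."],
    "types":    ["mypy", "src"],
    "tests":    ["pytest", "-q"],
    "security": ["pip-audit"],
}

def run_check(name: str, cmd: list[str]) -> tuple[str, bool]:
    """Run one validation step and report whether it passed."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return name, result.returncode == 0

def validate_in_parallel() -> dict[str, bool]:
    """Launch every independent check concurrently instead of in sequence."""
    with ThreadPoolExecutor(max_workers=len(CHECKS)) as pool:
        futures = [pool.submit(run_check, name, cmd) for name, cmd in CHECKS.items()]
        return dict(f.result() for f in futures)

if __name__ == "__main__":
    for name, ok in validate_in_parallel().items():
        print(f"{name}: {'pass' if ok else 'fail'}")
```

Wire something like this into a pre-push hook or a branch pipeline, and most of the validation signal exists before the pull request is even opened.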

Shift 2: From static to adaptive validation

Your pipeline shouldn’t run the same tests on every change. A one-line configuration update doesn’t need the full regression suite. A change to payment processing needs everything. Adaptive validation understands context and responds intelligently, maintaining quality while dramatically reducing feedback time.

Our data shows that [intelligent test selection](https://circleci.com/docs/test-splitting/) based on change impact can reduce feedback time by up to 97%. A 45-minute pipeline can drop below three minutes. That’s not corner-cutting; it’s intelligence applied to validation.
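The mechanics don’t have to be exotic. Below is a hypothetical Python sketch of change-impact selection: map source areas to the test suites that cover them, then run only what a diff actually touches. The mapping, paths, and fallback are illustrative assumptions, not CircleCI’s implementation; the test-splitting docs linked above describe the platform-level feature.

```python
import subprocess

# Hypothetical mapping from source areas to the test suites that cover them.
IMPACT_MAP = {
    "src/payments/": ["tests/payments", "tests/integration"],  # high-risk area: run everything related
    "src/api/":      ["tests/api"],
    "config/":       ["tests/smoke"],                          # a one-line config tweak needs far less
}

def changed_files(base: str = "origin/main") -> list[str]:
    """List files changed relative to the base branch."""
    out = subprocess.run(["git", "diff", "--name-only", base],
                         capture_output=True, text=True, check=True)
    return [line for line in out.stdout.splitlines() if line]

def select_suites(files: list[str]) -> list[str]:
    """Pick only the suites whose source areas were touched; unknown changes get the full run."""
    suites: set[str] = set()
    for path in files:
        for prefix, mapped in IMPACT_MAP.items():
            if path.startswith(prefix):
                suites.update(mapped)
    return sorted(suites) if suites else ["tests"]

if __name__ == "__main__":
    subprocess.run(["pytest", *select_suites(changed_files())], check=False)
```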

Shift 3: From human to autonomous recovery

When builds fail at 2 AM, humans shouldn’t be paged to fix flaky tests or configuration drift. Autonomous agents should handle the mechanical work: fixing flaky tests, updating dependencies, and resolving configuration conflicts, while humans focus on architectural decisions and complex problem-solving.
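The triage half of that work is straightforward to sketch. In the hypothetical snippet below, any test that both passes and fails on the same commit gets flagged as a flake candidate; that list is what an agent (or a human) would then act on. The record format is assumed purely for illustration.

```python
from collections import defaultdict

def flake_candidates(runs: list[tuple[str, str, bool]]) -> list[str]:
    """Flag tests that both passed and failed on the same commit.

    `runs` holds (commit_sha, test_name, passed) records pulled from recent
    CI history; the shape of the data is assumed for illustration.
    """
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for sha, test, passed in runs:
        outcomes[(sha, test)].add(passed)
    return sorted({test for (_, test), seen in outcomes.items() if seen == {True, False}})

# test_checkout fails and then passes on the same commit -> a flake candidate.
history = [
    ("abc123", "test_checkout", False),
    ("abc123", "test_checkout", True),
    ("abc123", "test_login", True),
    ("def456", "test_login", True),
]
print(flake_candidates(history))  # ['test_checkout']
```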

CircleCI’s introduction of Chunk, an autonomous agent for CI, demonstrates this shift in action. During private beta, Chunk automatically opened pull requests for 90% of flaky tests analyzed. It works while teams sleep, turning red builds green without human intervention.

This shift transforms your CI/CD pipeline from a source of toil into a self-healing system that improves continuously without burning out your team.

The platform engineering foundation

The DORA report emphasizes that 90% of organizations have adopted platform engineering, making it table stakes. But adoption isn’t implementation.

To help teams break through the AI delivery bottleneck, platform teams need to enable the shift to autonomous validation across the entire organization. That starts with two things: well-defined standards that can be rolled out across teams with minimal friction, and preconfigured pipelines that ship with the testing, security, and delivery tools needed to support those shifts at scale.

CircleCI’s platform team toolkit gives platform teams the framework to enforce these standards and embed fast feedback, context-aware testing, and automated recovery into every workflow by default.

Value Stream Management (VSM) ensures these improvements focus on the right constraints. As DORA warns, “Without mature VSM practices, AI risks creating localized efficiencies that are simply absorbed by downstream bottlenecks.” Platform teams can use delivery data to identify where work actually gets stuck and implement targeted improvements to keep it moving. Otherwise, AI acceleration just creates faster pile-ups at the same old constraints: the AI delivery bottleneck doesn’t disappear, it just moves downstream.

The competitive imperative

Every day, AI widens the gap between elite performers and everyone else.

Our State of Software Delivery data shows that elite performers already achieve 5x higher throughput than low performers. Add AI to that equation, and we’re looking at differences of orders of magnitude. While you deploy once a day, your competitor deploys 50 times. While you react to customer feedback next week, they iterate multiple times today.

For a 500-developer organization, the business case is clear:

  • Reducing workflow duration by 10 minutes, whether from 20 to 10 minutes or 90 to 80 minutes: $1.1 million annual savings
  • Cutting MTTR from 4 hours to 90 minutes: $3 million in recovered productivity
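Those figures depend, of course, on assumptions about build volume, failure counts, and loaded engineering cost. A back-of-the-envelope sketch, with every input explicitly hypothetical, shows the shape of the estimate in engineer-hours; multiply by whatever loaded hourly cost your organization uses to turn the result into dollars.

```python
# All inputs are hypothetical placeholders; substitute your own numbers.
DEVELOPERS = 500
WORKING_DAYS_PER_YEAR = 250
ATTENDED_RUNS_PER_DEV_PER_DAY = 2   # pipeline runs a developer actively waits on (assumed)
FAILED_BUILDS_PER_DAY = 60          # org-wide failures needing human attention (assumed)

def hours_recovered_from_faster_pipelines(minutes_saved_per_run: float) -> float:
    """Engineer-hours recovered per year by shaving minutes off every attended run."""
    return (DEVELOPERS * ATTENDED_RUNS_PER_DEV_PER_DAY
            * minutes_saved_per_run / 60 * WORKING_DAYS_PER_YEAR)

def hours_recovered_from_faster_mttr(hours_saved_per_failure: float) -> float:
    """Engineer-hours recovered per year by cutting recovery time on failed builds."""
    return FAILED_BUILDS_PER_DAY * hours_saved_per_failure * WORKING_DAYS_PER_YEAR

# 10 minutes shaved off each attended run; MTTR cut from 4 hours to 90 minutes.
print(f"pipeline speed-up: {hours_recovered_from_faster_pipelines(10):,.0f} engineer-hours/year")
print(f"faster recovery:   {hours_recovered_from_faster_mttr(4 - 1.5):,.0f} engineer-hours/year")
```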

Every improvement in feedback speed or recovery time strengthens an organization’s ability to deliver and respond. These improvements are the difference between winning and watching competitors disappear into the distance.

The choice before you

The DORA report confirms what our data at CircleCI has shown for years: AI is an amplifier. It will amplify whatever system you have. If that system is optimized for human-speed development, AI will amplify its constraints until they break. If that system evolves for machine-speed validation, AI will amplify its capabilities until you dominate.

The AI delivery bottleneck is real, measurable, and growing worse every day. But it’s also solvable. Organizations investing in autonomous validation, through platforms like CircleCI with capabilities like Chunk, are transforming AI from a source of instability into a sustainable competitive advantage.

Don’t let AI-generated code compromise your ability to ship. Make the three shifts. Build on a platform that enables autonomous validation. Turn the AI delivery bottleneck into your competitive advantage.

Some of your competitors are doing this already. See how CircleCI transforms AI chaos into competitive advantage by booking a demo today.