How DevOps teams are using—and abusing—DORA metrics
Since the 2018 release of the book Accelerate: The Science of Lean Software and DevOps, the “DORA metrics” it introduced have grown in popularity as a way to measure software development.
But they have also been used for the wrong reasons, resulting in poor outcomes. Here's why that's happening—and what you can do about it.
DORA metrics can be a double-edged sword
DORA stands for DevOps Research and Assessment, an information technology and services firm founded by Nicole Forsgren, Jez Humble, and Gene Kim. In Accelerate, Forsgren, Humble, and Kim collected and summarized the outcomes many of us have seen when moving to a continuous flow of value delivery. They also discussed the behaviors and culture that successful organizations adopt and provided guidance on what to measure and why. Included are four key outcomes where high-performing organizations excel:
- Deploy frequency
- Lead time
- Change fail percentage
- Mean time to repair
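As a deliberately simplified illustration, here is how these four outcomes might be computed from a log of deployment records. The data shape and field names below are assumptions for the sketch, not a standard schema:

```python
from datetime import datetime

# Hypothetical deployment records: commit time, deploy time, whether the
# change failed in production, and hours to restore service if it did.
deployments = [
    {"committed": datetime(2021, 9, 1, 9), "deployed": datetime(2021, 9, 1, 15),
     "failed": False, "restore_hours": 0},
    {"committed": datetime(2021, 9, 2, 10), "deployed": datetime(2021, 9, 3, 11),
     "failed": True, "restore_hours": 2},
    {"committed": datetime(2021, 9, 6, 8), "deployed": datetime(2021, 9, 6, 12),
     "failed": False, "restore_hours": 0},
    {"committed": datetime(2021, 9, 7, 9), "deployed": datetime(2021, 9, 8, 9),
     "failed": True, "restore_hours": 6},
]

days_observed = 7
deploy_frequency = len(deployments) / days_observed  # deploys per day

lead_times = [(d["deployed"] - d["committed"]).total_seconds() / 3600
              for d in deployments]
mean_lead_time_hours = sum(lead_times) / len(lead_times)

change_fail_pct = 100 * sum(d["failed"] for d in deployments) / len(deployments)

failures = [d["restore_hours"] for d in deployments if d["failed"]]
mean_time_to_repair = sum(failures) / len(failures)

print(f"Deploy frequency: {deploy_frequency:.2f}/day")        # 0.57/day
print(f"Mean lead time: {mean_lead_time_hours:.2f} h")        # 14.75 h
print(f"Change fail percentage: {change_fail_pct:.0f}%")      # 50%
print(f"Mean time to repair: {mean_time_to_repair:.1f} h")    # 4.0 h
```

The point is not the arithmetic, which is trivial, but that all four numbers only mean something relative to your own team's history, as the rest of this article argues.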
In the following years, as more organizations worked to modernize software delivery, the DORA metrics became the default indicator of how those efforts were progressing. This is a double-edged sword, though. The metrics seem simple, so the attitude becomes, “Let’s measure this and display the data and we’ll know everything.” But when applied to anything involving people, the DORA metrics are complex and require constant vigilance and adjustment to prevent adverse side effects. As these metrics spread, so do the anti-patterns.
When the purpose for metrics is misunderstood, serious issues can occur. We are increasingly seeing DORA metrics used as goals, complete with OKRs (objectives and key results) where the objective is “improve DORA metrics.” But improving metrics should never be your goal. As Goodhart's Law states, “When a measure becomes a target, it ceases to be a good measure.” High-performing organizations did not become high performing from focusing on metric goals. They became high performing by focusing on the customer and how to more efficiently, effectively, and sustainably deliver value. The metrics are the outcome.
Related to this is the idea of using DORA metrics to compare delivery performance between teams. Every team has its own context: a different product, different delivery environments, and different problem spaces. You can track each team's improvement and, if you have a generative culture, show teams how they are improving relative to one another, but stack-ranking teams will harm customer and business value. When metrics are used to manage performance rather than to track the health of the entire delivery system, they push us down the path toward becoming feature factories.
Another common problem is the growth of “vanity radiators,” metric dashboards that display numbers but give no obvious clue about what action to take. “We deployed 436 times!” OK, but how big is the organization? Was that 436 times in a week, a day, or a year? Is that number improving or degrading? What action should we try next to improve? More importantly, how quickly are we getting feedback on the value we are actually delivering?
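One way to make a raw deploy count actionable is to normalize it into a rate and compare it against a prior period. The weekly counts below are made-up illustration data:

```python
# Turn a raw deploy count into an actionable signal: a per-week rate
# and its direction of change. Counts here are illustrative only.
weekly_deploys = [38, 41, 45, 52, 60, 64, 68, 68]  # last 8 weeks

current = sum(weekly_deploys[-4:]) / 4   # average of the last 4 weeks
previous = sum(weekly_deploys[:4]) / 4   # average of the 4 weeks before

trend = "improving" if current > previous else "flat or degrading"
print(f"Deploy rate: {current:.1f}/week ({trend}, was {previous:.1f}/week)")
```

A dashboard showing the rate and trend invites a next experiment; a dashboard showing “436 deploys” invites applause.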
Here's the problem you really need to solve
Are there only four metrics? Not at all. Accelerate identifies four key metrics for measuring the outcomes of your improvement experiments, but those metrics only make sense in the context of everything else they recommend:
“Continuous delivery improves both delivery performance and quality, and also helps improve culture and reduce burnout and deployment pain.”
So the problem you need to solve is this: what improvements do you need so that your software is always in a releasable state while you continuously change it? To do that, start with continuous integration. CI is the practice of integrating tested, releasable code into the trunk very frequently, at least once per day. It is the core behavior that uncovers constraints across an entire organization, and it is also the easiest and most accurate activity to instrument when you start tracking improvement goals. How long do branches exist? How frequently are changes made to the trunk? How quickly can you identify defects?
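As one way to start instrumenting those questions, you could export branch records from your Git host and compute branch lifetime and trunk integration frequency. The records below are hard-coded stand-ins; the field names are assumptions, not any particular API's schema:

```python
from datetime import datetime

# Hypothetical branch records: when each short-lived branch was created
# and when it was merged to trunk. In practice you would export these
# from your Git host rather than hard-code them.
branches = [
    {"created": datetime(2021, 9, 1, 9),  "merged": datetime(2021, 9, 1, 16)},
    {"created": datetime(2021, 9, 1, 10), "merged": datetime(2021, 9, 3, 10)},
    {"created": datetime(2021, 9, 2, 9),  "merged": datetime(2021, 9, 2, 13)},
]

lifetimes_hours = [(b["merged"] - b["created"]).total_seconds() / 3600
                   for b in branches]
avg_branch_lifetime = sum(lifetimes_hours) / len(lifetimes_hours)

working_days = 3
merges_per_day = len(branches) / working_days  # trunk integrations per day

print(f"Average branch lifetime: {avg_branch_lifetime:.1f} h")
print(f"Trunk integrations: {merges_per_day:.1f}/day")
```

If average branch lifetime trends down and trunk integrations trend up, your CI behavior is improving, regardless of what any single dashboard number says.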
Next, you need to improve the flow of delivery so that you can improve your ability to meet customer expectations. How many things do you have in progress? How can you reduce that so you can stop starting and start finishing? What is the total lead time from request to delivery? Value stream-map the process to find and remove the constraints.
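A well-known way to reason about the work-in-progress question is Little's Law, which relates average lead time to WIP and throughput. The numbers below are illustrative:

```python
# Little's Law: average lead time = average WIP / average throughput.
# Illustrative numbers, not from any real team.
wip = 30        # items in progress at once
throughput = 5  # items finished per week

lead_time_weeks = wip / throughput
print(f"Expected lead time: {lead_time_weeks:.0f} weeks")

# Halving WIP (stop starting, start finishing) at the same throughput
# halves the expected lead time:
improved = (wip / 2) / throughput
print(f"With half the WIP: {improved:.0f} weeks")
```

This is why reducing WIP, not working harder, is usually the fastest way to shorten lead time from request to delivery.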
You need to focus on customer outcomes, or none of your efforts will matter. Are your users happy? Do you have a stable, available, and secure product that they can depend on? Are new features being used, or should you remove them?
It all starts with building the right culture
All of this depends on having the right culture and working in a sustainable way. Can you create a Westrum culture survey to gain insights into issues you may have that will prevent positive outcomes? Having a culture of trust and a shared mission of learning and improving is foundational. You also need to find ways to measure team stress. Would people recommend others join their team or your organization? How many hours are spent outside of normal working hours on things such as support or releasing new changes? These things cause burnout and turnover. What signals can you track to improve these?
DORA recommends tracking all of these things. A brief reading of Accelerate yields, not four, but more than 14 measurable indicators of organizational and delivery health. All of them are important for understanding the complex interactions of people, process, and products required to continuously deliver value to the end user.
Join me October 5-7 at DevOps Enterprise Summit Virtual - US, where I’ll be discussing these ideas further and looking for deeper conversations related to improving how you can deliver value.