How software teams can measure anything: The QPPE metrics model

Software metrics do not have to be difficult. In fact, I've successfully used only four key metrics to measure just about everything for several years now.

But before describing those four metrics, here's a bit of background. When I was an IBMer, I was asked to join a team of about 12 people to reduce a set of about 120 candidate measures to a top 10. It was believed that this would make it easier for customers to adopt.

The list left me puzzled. Very few had definitions, so I didn't understand many of the metrics. None had corrective actions or control ranges. I'd been helping my clients measure their software development organizations for years, and I told the group that, honestly, we could measure every one of them with just five metrics.

It was common wisdom in this group that measuring organizational transformation, software or otherwise, involved a multi-step process: For each client, we would find the business drivers. We would explore the client's unique issues and goals. We would look at what the client measured already. It was quite a long ordeal.

So to suggest that none of that was necessary raised a lot of questions. But ultimately, the group adopted my suggestion for the five. Here's how software teams can measure anything.

The QPPE metrics model

Those five metrics were reduced to four as they evolved through use at clients in various industries over the years, and they have been stable since 2012. The order is important as well.

  • Quality
  • Predictability
  • Productivity
  • Engagement

    This year, my colleague Bobby Walters decided to pronounce them "kew-pee" metrics and I loved it. Naming ideas gives them power and makes them much easier to share.

    If you wish to use something else, by all means invent away. But get your entire organization to commit to your metric framework, and keep it simple. I have worked with clients since 1999 and have been helping them measure success since 2003.

    Once I found the QPPE measures, every client I have had has agreed to this framework. This was because they either hadn't been measuring anything at all, or they had been, but with disappointing results.

    My clients can now try any innovations that come to mind and then use "innovation accounting" to measure whether the innovation is actually helping them. This can include going agile, using offshore services, adopting SAFe, using Rational Team Concert, or adopting Jira.

    You might also try using my FAST estimation technique to estimate more user stories, and to do so more effectively. From small ideas to giant transformations, this technique plus the QPPE metrics can boost your success rate every time. 

    QPPE metrics in real life

    Committing to one shared framework allows three amazing things. One, we actually know which ideas to keep investing in. Two, no one bickers about what to measure, which means a faster time to results. And three, people stop saying no to new ideas. Instead we say, "Sounds like a throwdown!" and then run the experiment using the QPPE metrics to evaluate the results. Imagine working on teams that always say yes to new ideas.

    Sometimes we don't even run the experiment. When an idea is suggested, we "back-of-the-napkin" our way to what we believe we'll find. If the new idea wins on all four metrics, we know we have ourselves a hit. Often, one or two are clear winners, one seems like a draw, and one is unclear. We then jot down what makes it unclear and run the experiment looking for those answers.
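    The back-of-the-napkin call described above can be sketched as a tiny scorecard. This is only an illustration: the function name, the labels, and the "loss" label are assumptions added for the example, not part of the QPPE model itself.

```python
# Hypothetical sketch of the "back-of-the-napkin" call on a new idea.
METRICS = ("quality", "predictability", "productivity", "engagement")
CALLS = {"win", "draw", "loss", "unclear"}  # "loss" is an assumed extra label


def napkin_verdict(calls):
    """Map each QPPE metric to a call and decide whether to experiment.

    Returns a (verdict, open_questions) pair. open_questions lists the
    metrics marked "unclear", which the experiment should answer.
    """
    for metric in METRICS:
        if calls.get(metric) not in CALLS:
            raise ValueError(f"missing or invalid call for {metric}")
    # A win on all four metrics is treated as a hit without an experiment.
    if all(calls[m] == "win" for m in METRICS):
        return "hit", []
    open_questions = [m for m in METRICS if calls[m] == "unclear"]
    return "run the experiment", open_questions


verdict, questions = napkin_verdict({
    "quality": "win",
    "predictability": "win",
    "productivity": "draw",
    "engagement": "unclear",
})
```

    Here the verdict is to run the experiment, with engagement as the question to answer.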

    Don't reward good metrics, and don't punish poor ones. The latter will lead to people "cooking the books"—and both may be rewarding coincidence instead of intent or technique. Instead, watch for metric shifts and watch your metric outliers. Look for innovation ideas.

    Then apply them to other teams. If their metrics improve after also performing the innovation idea, reward all the teams that validated the innovation idea and ask them to help spread it to the rest of the organization.

    What to measure

    People often struggle with what to measure. You can use the QPPE model at any level: team, program, portfolio, transformation. But it is important to start small and test your architecture. I ask my clients to capture just one simple metric across their target organization. Once we prove we can do that, we can then add the more complicated metrics to the model.

    By doing this, I find who the arguers are, the people who try to make a simple metric much more difficult. I find what parts of the organization are more resistant and which are the innovators and early adopters. It also forces them to find a central repository for their metrics, to which everyone can read and write. This is usually no small feat.

    In addition to the single metric, there is a team profile as well that allows us to filter our metrics. This includes things like sub-organization, lifecycle, and weeks as a team—any attribute that you would want to query your metrics on.
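    One way to picture the team profile is as a small record attached to every metric value, so that any profile attribute can be used as a filter. The class names, field names, and sample teams below are hypothetical, chosen only to mirror the attributes mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamProfile:
    # Attributes you may want to query metrics on (names are assumptions).
    team: str
    sub_organization: str
    lifecycle: str
    weeks_as_team: int


@dataclass(frozen=True)
class MetricRecord:
    profile: TeamProfile
    metric: str
    value: float


def filter_records(records, metric, **profile_filters):
    """Return records for one metric whose profile matches every filter."""
    return [
        r for r in records
        if r.metric == metric
        and all(getattr(r.profile, key) == wanted
                for key, wanted in profile_filters.items())
    ]


# Made-up sample data for illustration only.
alpha = TeamProfile("Alpha", "Payments", "agile", 40)
beta = TeamProfile("Beta", "Payments", "waterfall", 12)
records = [
    MetricRecord(alpha, "mtbr_weeks", 4.0),
    MetricRecord(beta, "mtbr_weeks", 13.0),
]
agile_only = filter_records(records, "mtbr_weeks", lifecycle="agile")
```

    Keeping the profile separate from the metric value means a new filter attribute can be added later without changing how metrics are captured.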

    Measure your releases. Later you can look at measuring other things such as teams, programs, and portfolios. But releases are the key to software success. So that means you need to measure release quality, release predictability, release productivity, and release engagement.

    QPPE explained: Quality

    We define quality as being value and defects. It has always struck me as odd to hide value under quality, and I have often toyed with pulling it out into its own category. But every time I try it, the model only seems to suffer for it.

    The idea is this: A quality product is valuable. Think about software that you use even though it is buggy. You keep using it because the value exceeds the impact of the bugs. Companies have made millions by building software so valuable that people tolerate the bugs.

    There are many ways to measure value. But the top three are revenue, actual cost reduction, and a perceived value survey where stakeholders either rate a release on a scale or as a percentage. So, for example, a survey could be done on each release asking how much value your users received.

    Measuring defects is much easier and makes for a great starting metric. Count the number of production defects over time. Overlay your releases and watch how the defect trends shift right after a release.
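    As a rough sketch of overlaying a release on the defect counts, the snippet below compares production defects in a window before and after a release date. The function name, the 28-day window, and the dates are all made up for illustration.

```python
from datetime import date, timedelta


def defects_around_release(defect_dates, release_date, window_days=28):
    """Count production defects in equal windows before and after a release.

    The window length is an assumption; pick whatever suits your cadence.
    """
    window = timedelta(days=window_days)
    before = sum(1 for d in defect_dates
                 if release_date - window <= d < release_date)
    after = sum(1 for d in defect_dates
                if release_date <= d < release_date + window)
    return before, after


# Hypothetical defect log and release date.
defect_log = [
    date(2024, 2, 10), date(2024, 2, 20),
    date(2024, 3, 2), date(2024, 3, 3), date(2024, 3, 15),
    date(2024, 4, 20),  # outside both windows
]
before, after = defects_around_release(defect_log, date(2024, 3, 1))
```

    With this sample data there are two defects in the four weeks before the release and three in the four weeks after it, the kind of shift worth watching across several releases.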

    When we try new innovations, it is critical to measure the impact on quality. If quality takes a hit, usually that is a sign that the innovation is not working out.

    The QPPE measures counter-balance each other. Many companies want to do a productivity play. But if they don't also measure quality, people start optimizing productivity at the expense of quality. So this is why quality is the first measure in the QPPE model. It is the most important one, especially since in our definition it includes value.

    Figure 1: A quality dashboard showing value as well as defects over time.

    Predictability

    The second measure in the QPPE model is predictability. Which is "better": a team that delivers 10, 100, 15, and 115 points across four sprints, or a team that delivers 30, 30, 30, and 30?
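    One simple way to put a number on that comparison is the coefficient of variation of points delivered per sprint (standard deviation divided by the mean). This is one possible measure, not the one the QPPE model prescribes; the function name is an assumption.

```python
from statistics import mean, pstdev


def velocity_variation(points_per_sprint):
    """Coefficient of variation: population std dev divided by the mean.

    0.0 means perfectly steady delivery; larger values mean less
    predictable delivery.
    """
    average = mean(points_per_sprint)
    if average == 0:
        raise ValueError("cannot compute variation for zero average velocity")
    return pstdev(points_per_sprint) / average


erratic = velocity_variation([10, 100, 15, 115])  # roughly 0.80
steady = velocity_variation([30, 30, 30, 30])     # exactly 0.0
```

    The first team delivers more points in total, but its sprint-to-sprint variation is about 80% of its average, while the second team's is zero.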

    Predictability should be mastered first. Once that is achieved, then we can start looking at productivity plays to see if we can raise our productivity while maintaining both quality and predictability. This aligns nicely with the Agile Manifesto's principle of having a sustainable pace.

    There are many cool ways to measure predictability. But for starters, again look at the release. How good are we at releasing on time, on budget, on scope, on quality, and on value? Use a simple pie chart that shows the dollars spent on failed projects, severely impacted projects, impacted but successful projects, and strongly successful projects.

    Then watch to see which innovations improve that pie. You can create your own definitions, but here are some to consider:

    • Successful as planned: Within 5% of original plans on time, cost, scope, quality, and defects.
    • Impacted: Within 20% of original plans.
    • Severely impacted: More than 20% off the original plans, but the value delivered was greater than the cost to build.
    • Failed projects: Canceled projects, projects that deliver zero value, projects where the cost to build exceeded the value gained. Agile promises to reduce the cost of failed projects by exposing bad projects earlier than did traditional waterfall.
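    The definitions above can be sketched as a small classifier that also totals the dollars spent per category for the pie chart. Two assumptions are baked in and should be adjusted to your own definitions: failure conditions are checked first, and a project whose cost equals its value counts as failed. All names and sample figures are hypothetical.

```python
from collections import defaultdict


def classify_project(worst_deviation, value, cost, canceled=False):
    """Classify a project by its worst deviation from the original plan.

    worst_deviation is the largest absolute deviation across the tracked
    dimensions, as a fraction (0.05 means 5%).
    """
    # Assumption: failure is checked before the deviation bands.
    if canceled or value <= 0 or cost >= value:
        return "failed"
    if worst_deviation <= 0.05:
        return "successful as planned"
    if worst_deviation <= 0.20:
        return "impacted"
    return "severely impacted"


def dollars_by_category(projects):
    """Sum the cost of projects per category (the slices of the pie)."""
    totals = defaultdict(float)
    for p in projects:
        category = classify_project(
            p["worst_deviation"], p["value"], p["cost"],
            p.get("canceled", False),
        )
        totals[category] += p["cost"]
    return dict(totals)


# Made-up portfolio for illustration only.
portfolio = [
    {"worst_deviation": 0.03, "value": 900_000, "cost": 400_000},
    {"worst_deviation": 0.15, "value": 700_000, "cost": 500_000},
    {"worst_deviation": 0.60, "value": 2_000_000, "cost": 1_200_000},
    {"worst_deviation": 0.10, "value": 0, "cost": 300_000},
]
spend = dollars_by_category(portfolio)
```

    Re-running the same totals after each innovation shows whether spend is shifting away from the failed and severely impacted slices.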

    Figure 2: This project success dashboard shows overall initiative success.

    Another key metric in predictability is time to accuracy. If you do waterfall, agile, SAFe, Kanban, etc., at the start you may try to predict your releases. None of these methodologies changes how accurate that first guess is.

    But agile methodologies promise a faster time to accuracy. For example, a company predicts that a project will cost $1 million and take a year to build. In the end, it costs $3 million and takes three years. At what point did they know it would be over budget and way beyond the time frame? I have clients who didn't know until 2.7 years in that their one-year project was going to turn into three years. And had they known, they would never have taken that project on. In this scenario, the time to accuracy would be 270%.

    Agile doesn't solve bad estimation. But it promises to expose the real answer faster. So it is exciting to compare the time to accuracy for agile projects versus the time to accuracy for traditional projects to see if it is doing the job.
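    The 270% figure in the example above appears to be the elapsed time at which the real answer became known, divided by the originally planned duration. A minimal sketch under that reading (the function and parameter names are assumptions):

```python
def time_to_accuracy(planned_duration, elapsed_when_accurate):
    """Elapsed time until estimates matched reality, relative to the plan.

    Both arguments use the same unit (years here). Lower is better.
    """
    if planned_duration <= 0:
        raise ValueError("planned duration must be positive")
    return elapsed_when_accurate / planned_duration


# The one-year project whose true size was known only 2.7 years in.
ratio = time_to_accuracy(planned_duration=1.0, elapsed_when_accurate=2.7)
label = f"{ratio:.0%}"  # "270%"
```

    A team that learned the real answer three months into the same one-year plan would score 25% on this measure.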

    Figure 3: Time to Accuracy. How long should an initiative run before estimates match actuals? Lower is better.

    Productivity

    The third metric is productivity, and it is third for a reason. We need to master quality and predictability before we start messing around with increasing productivity. But it is the easiest metric to capture. So while it is third in importance as a target of innovation, my clients often master capturing organization-wide metrics with this one first.

    The QPPE metrics work regardless of which lifecycle teams are using: agile, waterfall, Kanban, it all works. And if I add another lifecycle to the metrics profile, I can actually see if one lifecycle outperforms another.

    We define productivity as time, cost, and scope—the iron triangle. Predictability shows us how quickly we can predict the final numbers. But in general what are the numbers? And how are we doing over the last three years?

    Here we can measure scope delivered using the number of projects, the number of releases, and even the number of portfolio points. We can study the cost of releases. But the best metric in this space is the easiest metric of all: mean time between releases (MTBR). It is very easy: It's the number of calendar weeks divided by the number of releases to production.

    When that number shrinks, it means we are putting software out more frequently. If quality (including value) and predictability remain constant, this can be a big win for the organization.
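    As a worked example of MTBR, the sketch below divides calendar weeks by production releases. The function name and the sample numbers are illustrative assumptions.

```python
def mean_time_between_releases(calendar_weeks, production_releases):
    """MTBR in weeks: calendar weeks divided by releases to production."""
    if production_releases <= 0:
        raise ValueError("need at least one production release")
    return calendar_weeks / production_releases


# Hypothetical year-over-year comparison for one organization.
last_year = mean_time_between_releases(52, 4)    # 13.0 weeks
this_year = mean_time_between_releases(52, 13)   # 4.0 weeks
```

    Going from 13 weeks to 4 weeks between releases is only a win if the quality and predictability numbers hold steady over the same period.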

    People push back on MTBR with statements such as, "Well, we could release more often, but it would be stuff that is less valuable." Or, "What if our client can't take on that much change?"

    Regarding the first statement: No one gets rewarded solely for improving MTBR. Now, if you have an innovation that improves MTBR, such as implementing DevOps or continuous integration, you will be rewarded if your innovation helps multiple teams improve their MTBR.

    But why purposefully release less value just to make a metric better? And how will you get all those other teams to game the system with you? I trust my teams wouldn't do that. But if they did, our quality/value metrics would take a dive.

    For the second, if your client can't take more releases, do what is right for your client. Again, no one gets rewarded for their MTBR metric.

    If you want to start with one easy metric to test your metrics architecture—to assess your organization's ability to come together on one simple idea—start with MTBR.

    Figure 4: Productivity Dashboard. How much scope do you deliver over time? What does it cost? How long does it take?

    Engagement

    The final metric is engagement. In this case, it isn't the least important, but it fits at the tail end of the game. We define engagement as:

    • Are they doing it? 
    • Are they doing it well?
    • Do they like it?
    • Do they think it is a good idea?

    Imagine you adopt agile, and you measure your releases. After a year, you notice that none of the QPP metrics got any better. What happened? Is the innovation idea a flop? Or maybe they said they were doing agile, but really kept doing waterfall and just used agile words.

    Engagement tells us if the movement or lack thereof in the QPP is real. If they didn't actually do the innovation idea, or tried and abandoned it, or tried but did it poorly, that may be why your metrics didn't move. But if your engagement metrics look good and the QPP isn't moving, the innovation idea may really be a dud.
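    The reasoning above can be read as a two-by-two table of QPP movement against engagement. The sketch below is one interpretation: the case where QPP improved but engagement is poor is not spelled out in the text, so that branch is an assumption, as are the function name and the wording of the results.

```python
def interpret(qpp_improved, engagement_good):
    """Read QPP movement in light of engagement (a simplified sketch)."""
    if qpp_improved and engagement_good:
        return "improvement is likely real"
    if qpp_improved:
        # Assumption: the text does not cover this case explicitly.
        return "improvement may be coincidence; check adoption"
    if engagement_good:
        return "innovation may be a dud"
    return "innovation was not really tried; fix adoption before judging it"


# Adopted agile in name only and nothing moved.
reading = interpret(qpp_improved=False, engagement_good=False)
```

    In practice both inputs would come from trends across several teams, not a single yes or no.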

    Figure 5: Employee Engagement. Are your development teams engaged? Do they think they are on the right track?

    Lightweight examples

    So now that you understand the QPPE measures, you can apply them to anything.

    We used it to prove FAST estimation was better than planning poker, another estimation and planning technique. The accuracy of the two techniques (predictability: average points planned versus actual per sprint) was equal, but FAST estimation allowed 40 stories to be estimated per hour on average, while planning poker averaged 8 per hour. FAST estimation also won on the engagement scores.

    We used it to prove "a user can" was a better technique than "as a <role> I want to <goal> so that I can <reason>." "A user can" had higher productivity—teams could write about four times more stories in the same two-hour periods—equal quality, and higher engagement scores.

    Finally, here are all the release metrics we looked at on a single dashboard for a three-year period. If you have no historical data, show your metrics by quarter until you have two years of history, then switch to yearly.

    Figure 6: The Full Release Dashboard. Here are the QPPE metrics for releases all on one page.

    Get started

    You don't have to invent new metrics for every client or every innovation to prove things are getting better. If you commit to the QPPE model, you'll find you can measure just about anything and reduce the time to figure out which innovations are working for your company. Don't reward or punish good or bad metrics; instead, reward innovations that move multiple teams in a positive direction. If you measure only one thing, measure your releases.

    All images courtesy of Anthony Crain.
