

TL;DR
- Software quality metrics measure reliability, security, and delivery outcomes.
- Key metrics include CFR, MTTR, defect escape rate, deployment frequency, code coverage, rework rate, and lead time.
- Focus on metrics that reflect business value, not vanity numbers.
- Prioritize based on your team’s pain points.
- qTest helps with visibility, testing, traceability, and reporting.
Why do you need software quality metrics? So you can measure how well you can maintain your code, how reliable and secure it is, and how well it does its job.
These metrics provide the objective standards you need to improve your team’s efforts throughout the development life cycle.
The right software quality metrics make a difference because they measure outcomes instead of activities. You don’t have to look far to find dashboards overflowing with vanity metrics that don’t align with your business objectives.
Only a handful of metrics will aid you in driving improvement. Let’s look at seven of the most important ones.
Software quality metrics
Before we delve into the individual metrics, let’s start with a concise definition.
Software quality metrics quantify the reliability, maintainability, security, and performance of applications throughout their development life cycles. In order to be useful, they need to be objective: they must be measurable, actionable, and reproducible.
The DORA metrics
Covering quality metrics without mentioning the DORA metrics would be a serious oversight. The metrics take their name from the DevOps Research and Assessment (DORA) team, which devised them as an empirical way to measure why teams succeed or fail.
Several of the metrics on this list are members of the collection. We’ll point them out as we go.
1. Change failure rate
Change failure rate (CFR) is the percentage of deployments that result in production failures. It’s a direct signal of the effectiveness of your testing and general release pipeline health. It’s also the first DORA metric on this list.
change failure rate = (number of failed deployments / total deployments) × 100
So, as a simple example, if you deploy forty times over a three-month period and ten of those releases result in a new problem, that’s a 25% change failure rate.
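As a minimal sketch, the formula above translates directly into a few lines of Python. The function name and the input validation are illustrative assumptions, not part of any particular tool.

```python
def change_failure_rate(failed_deployments: int, total_deployments: int) -> float:
    """Return the change failure rate as a percentage."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total_deployments")
    return 100 * failed_deployments / total_deployments


# The example from the text: 40 deployments, 10 of which caused a new problem.
print(f"{change_failure_rate(10, 40):.0f}%")  # prints "25%"
```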
A high change failure rate is an indicator of inadequate testing, problems with your deployment process, or issues with your integration procedures.
A lower rate indicates fewer issues in those areas. But like any metric, chasing a low rate can lead to other problems, such as under-reporting issues or deferring deployments until a “better time.”
What’s a good change failure rate? There’s no universally accepted number, but many organizations consider 15% or lower a good target.
2. Mean time to recovery
How long does it take your organization to recover from an incident? Mean Time to Recovery (MTTR) is the average time it takes from the detection of a production incident to the restoration of normal service. It’s a DORA metric, too.
Incidents are a part of life. Change failure rate measures your ability to prevent them, but they’re going to occur. MTTR tells you how effective you are at getting things back online when they do.
mean time to recovery = total resolution time / number of incidents
Let’s assume that last month you had five incidents. They lasted forty-five minutes, fifteen minutes, thirty minutes, ten minutes, and three and a half hours.
That’s a total of 310 minutes of downtime. Divide that by five incidents, and you have a mean time to recovery of sixty-two minutes.
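Here is a small sketch of that arithmetic using Python's standard `timedelta` type. The function name is an assumption made for illustration.

```python
from datetime import timedelta


def mean_time_to_recovery(durations: list[timedelta]) -> timedelta:
    """Average the incident durations; the formula is total time / incident count."""
    if not durations:
        raise ValueError("need at least one incident")
    return sum(durations, timedelta()) / len(durations)


# The five incidents from the example above.
incidents = [
    timedelta(minutes=45),
    timedelta(minutes=15),
    timedelta(minutes=30),
    timedelta(minutes=10),
    timedelta(hours=3, minutes=30),
]

mttr = mean_time_to_recovery(incidents)
print(mttr.total_seconds() / 60)  # prints "62.0"
```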
When you’re measuring MTTR, keep an eye out for a few pitfalls:
- Don’t conflate planned outages with unplanned incidents. MTTR is about how quickly your team reacts to unexpected problems.
- Decide in advance how you’ll measure the time spent waiting for things outside of your control, such as power outages and replacement parts.
- Be consistent regarding when it’s time to stop and start the clock.
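One way to make those three decisions explicit is to encode them in the data you collect. The sketch below is an assumption about how you might model incidents: the `Incident` fields, the `planned` flag, and the `external_wait` value are hypothetical names, and the sample incidents are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    detected_at: datetime                   # the clock starts at detection
    restored_at: datetime                   # the clock stops when service is restored
    planned: bool = False                   # planned outages are excluded from MTTR
    external_wait: timedelta = timedelta()  # time spent waiting on outside parties


def unplanned_mttr(incidents: list[Incident], include_external_wait: bool = True) -> timedelta:
    """Mean recovery time over unplanned incidents only."""
    unplanned = [i for i in incidents if not i.planned]
    if not unplanned:
        raise ValueError("no unplanned incidents in this period")
    total = timedelta()
    for incident in unplanned:
        duration = incident.restored_at - incident.detected_at
        if not include_external_wait:
            duration -= incident.external_wait
        total += duration
    return total / len(unplanned)


# Hypothetical incidents: one short outage, one that waited an hour on a vendor,
# and one planned maintenance window that should not count.
sample = [
    Incident(datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 6, 9, 30)),
    Incident(datetime(2025, 1, 8, 14, 0), datetime(2025, 1, 8, 16, 0),
             external_wait=timedelta(minutes=60)),
    Incident(datetime(2025, 1, 11, 2, 0), datetime(2025, 1, 11, 6, 0), planned=True),
]

print(unplanned_mttr(sample))                               # prints "1:15:00"
print(unplanned_mttr(sample, include_external_wait=False))  # prints "0:45:00"
```

Whichever choices you make, the point is to apply the same ones to every incident so the trend stays comparable.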
A higher MTTR can point to at least two types of problems. First, how long does it take your teams to respond to an incident? Second, how difficult and time-consuming is it to isolate and repair an issue once they get started?
What constitutes a good MTTR? Many experts quote times like thirty minutes or even an hour. But this has a lot to do with your specific industry. The acceptable MTTR for an online retailer differs from that for a SaaS that provides business-critical services.
3. Defect escape rate
How effective is your team at finding problems before they’re shipped to production? Your defect escape rate is the percentage of defects that your end users discover, instead of your testing and quality assurance processes.
It’s the rate of defects that “escape” to production before they’re caught.
This measure reflects the effectiveness of your pre-release testing programs.
A high rate tells you that your quality gates, such as unit tests, integration tests, and staging environments, aren’t catching issues before they reach customers. So, by tracking this metric, you identify the gaps in your testing and adjust your QA processes.
defect escape rate = (production defects / (production defects + pre-production defects)) × 100
So if, over a six-month period, your users report five defects in production, compared to the fifty-six discovered in testing and pre-production, that’s a defect escape rate of about 8% (5 out of 61) for those two quarters.
That seems like a good number, but have you noticed what’s missing in that figure? What were the severities of those five defects?
If just one of them was a high-priority issue that took down production for days, or even just hours, that 8% is still too high. Conflating minor annoyances with high-severity outages makes this statistic nearly worthless.
Defect escape rate is useful as a trend for your team, but it doesn’t stand alone. Establish a baseline for your team and focus on reducing the rate of high-severity defects.
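The sketch below computes the overall rate from the example and then breaks the escaped defects down by severity. The severity labels are hypothetical, since the example doesn't specify them; they only illustrate why the headline number needs a second look.

```python
from collections import Counter


def defect_escape_rate(production_defects: int, pre_production_defects: int) -> float:
    """Percentage of all known defects that were first found in production."""
    total = production_defects + pre_production_defects
    if total == 0:
        raise ValueError("no defects recorded for this period")
    return 100 * production_defects / total


# The example from the text: 5 escaped defects versus 56 caught before release.
overall = defect_escape_rate(5, 56)
print(f"{overall:.1f}%")  # prints "8.2%"

# Hypothetical severities for the five escaped defects (not from the article).
escaped_severities = ["low", "low", "medium", "low", "high"]
by_severity = Counter(escaped_severities)
print(by_severity["high"])  # prints "1"
```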
4. Deployment frequency
How often is your team shipping to production? Deployment frequency measures how often you release new code to your users. It brings us back to the DORA metrics, as it’s on that list, too.
Why is how often you ship code a measure of quality? Another way to state deployment frequency is how often you deliver value to your customers.
Unlike lines of code or hours worked, deployment frequency is a direct measure of your team’s output. It tells you how efficiently your software delivery pipeline operates.
A higher frequency means:
- Your teams receive user feedback faster.
- You’re delivering changes in smaller batches that are easier to review, test, and debug.
- You reduce risk, since you’re distributing changes and eschewing larger “big bang” releases.
- You have a mature DevOps culture that has automated testing and delivery pipelines.
We don’t need a formula for this one. Pick an interval of time and count your deployments.
Just be sure to account for partial pushes like hotfixes and patches, versus complete deployments. We’ll be coming back to that below when we discuss rework rate.
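Counting can be as simple as grouping deployments by week. In this sketch, the deployment log and the `kind` labels ("release", "hotfix", "patch") are invented for illustration; your own pipeline will have its own records and categories.

```python
from collections import Counter
from datetime import date

# Hypothetical deployment log: (day, kind of deployment).
deployments = [
    (date(2025, 3, 3), "release"),
    (date(2025, 3, 5), "hotfix"),
    (date(2025, 3, 6), "release"),
    (date(2025, 3, 12), "release"),
    (date(2025, 3, 13), "patch"),
]


def deployments_per_week(records, kinds=None):
    """Count deployments per (ISO year, ISO week), optionally filtered by kind."""
    counts = Counter()
    for day, kind in records:
        if kinds is None or kind in kinds:
            iso = day.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


print(deployments_per_week(deployments))                     # every deployment
print(deployments_per_week(deployments, kinds={"release"}))  # complete deployments only
```

Reporting both counts side by side keeps hotfixes and patches from inflating the figure.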
Choosing an effective deployment frequency requires nuance, because you need to balance frequency with stability. But that’s okay, because you already have change failure and defect escape rates to help you keep things on track.
| Deployment Frequency | Failure & Escape Rate | Result |
| --- | --- | --- |
| High | Low | High performance—team is fast, but still delivers stable software. |
| High | High | Low performance—team is in a “break/fix” cycle. This indicates a high level of risk. |
| Low | Low | Conservative—team delivers stable software, but at a low rate that may miss opportunities and frustrate users. |
| Low | High | Very low performance—team moves slowly and still delivers unreliable releases. |
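The table above is effectively a two-input lookup, which can be sketched as follows. What counts as "high" is left to you; the boolean inputs assume you've already applied your own thresholds.

```python
def classify_delivery(high_frequency: bool, high_failure_or_escape: bool) -> str:
    """Map the two signals from the table onto its four outcomes."""
    outcomes = {
        (True, False): "High performance",
        (True, True): "Low performance",
        (False, False): "Conservative",
        (False, True): "Very low performance",
    }
    return outcomes[(high_frequency, high_failure_or_escape)]


print(classify_delivery(high_frequency=True, high_failure_or_escape=True))
# prints "Low performance"
```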
5. Code coverage
Even as you strive to improve your deployment frequency, you want to focus on your change failure and defect escape rates. So, you need a way to measure your testing efforts.
Code coverage is the percentage of your code exercised by automated tests. While this metric is not a comprehensive indicator of test quality, it’s a useful gatekeeper and indicator of hygiene.
While test runners usually calculate code coverage, let’s examine the formula.
code coverage = (number of lines executed by tests / total executable lines) × 100
This formula tells you the percentage of lines that your tests run. It doesn’t tell you how well or completely you’ve designed your tests, though.
There may even be circumstances where a test executes a line as part of a check for a different bit of behavior. So, in some circumstances, this metric may lead to a false sense of security. High code coverage does not equal high code quality.
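To make the formula concrete, here is a minimal sketch that works on sets of line numbers. Real coverage tools collect this data for you; the function names and the ten-line module are assumptions for illustration only.

```python
def line_coverage(executed_lines: set[int], executable_lines: set[int]) -> float:
    """Percentage of executable lines that at least one test ran."""
    if not executable_lines:
        raise ValueError("no executable lines to measure")
    covered = executed_lines & executable_lines
    return 100 * len(covered) / len(executable_lines)


def untested_lines(executed_lines: set[int], executable_lines: set[int]) -> list[int]:
    """Executable lines that no test ran."""
    return sorted(executable_lines - executed_lines)


# Hypothetical module with ten executable lines, eight of which the tests ran.
executable = set(range(1, 11))
executed = {1, 2, 3, 4, 5, 6, 7, 8}

print(line_coverage(executed, executable))   # prints "80.0"
print(untested_lines(executed, executable))  # prints "[9, 10]"
```

Note that the calculation only knows whether a line ran, not whether any assertion checked its result, which is exactly the limitation described above.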
But code coverage is a useful metric in several ways:
1. Highlighting untested areas
Testing tools tell you which lines your tests didn’t execute. While running a line isn’t a guarantee of effective testing, not running it means no testing occurred at all.
2. Quality baseline
A minimum threshold for code coverage acts as a gate for code quality. It prevents work with no tests from entering the codebase.
3. Dead code
If a section of code remains at zero percent coverage for a long time, it might be dead.
4. Regression safety
Good code coverage provides a safety net for detecting regressions early.
It’s not uncommon for teams to chase quantity over quality. They focus on raising the percentage rather than improving the quality and usefulness of their tests. Code coverage is an important metric, but like the others in this list, it doesn’t stand alone.
6. Rework rate
An unplanned deployment means your team had to address an unexpected issue. Rework rate measures the share of deployments that are unplanned, such as patches, hotfixes, or rollbacks. It became the fifth DORA metric in 2025.
rework rate = (unplanned deployments / total deployments) × 100
So, if your team executes ten deployments during a sprint and three of them were hotfixes, your rework rate was thirty percent.
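A short sketch of the same calculation follows. The set of labels treated as unplanned is an assumption based on the examples in this section; adjust it to match how your team tags deployments.

```python
# Labels treated as unplanned work (an assumption; define your own consistently).
UNPLANNED_KINDS = frozenset({"hotfix", "patch", "rollback"})


def rework_rate(deployment_kinds, unplanned_kinds=UNPLANNED_KINDS) -> float:
    """Percentage of deployments that were unplanned."""
    kinds = list(deployment_kinds)
    if not kinds:
        raise ValueError("no deployments recorded")
    unplanned = sum(1 for kind in kinds if kind in unplanned_kinds)
    return 100 * unplanned / len(kinds)


# The example from the text: ten deployments in a sprint, three of them hotfixes.
sprint = ["release"] * 7 + ["hotfix"] * 3
print(rework_rate(sprint))  # prints "30.0"
```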
Rework rate (RR) might look, at first glance, like another way to calculate change failure rate. But while CFR focuses on the rate of errors, the rework rate is a measure of the effort involved in addressing these problems.
One way to think about the difference is in terms of stability versus effort. CFR shows you how stable your production systems are. RR shows you how much effort you’re spending on fixing new problems.
A high rework rate has a negative impact on several of the metrics we’ve covered already.
| Metric | Impact of high rework rate |
| --- | --- |
| Deployment frequency | Effective throughput drops because rework consumes development capacity. |
| Lead time for changes | Increases as planned work queues behind urgent fixes. |
| Change failure rate | Correlated but, as covered above, distinct—CFR measures stability, rework rate captures effort. |
A high rework rate points toward code that’s not completely ready for production.
You might see a high RR alongside a high defect escape rate, meaning that your testing processes need work. It can also lead to lower team morale if new defects frequently redirect your engineers from planned work.
7. Lead time for changes
Lead time for changes (LTC) is the time between committing a change to version control and a successful deployment to production. It’s an indicator of how responsive and how efficient your development organization is.
This is another metric that doesn’t need a mathematical formula. To calculate it, you record when each change is committed and when it’s deployed and working in production, then measure the time between the two. This calculation is simple, but not easy.
You can’t track LTC if your development processes and delivery pipelines aren’t consistent, well-documented, and well-organized.
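If you do have consistent timestamps, the calculation itself is short. This sketch assumes you can pair each commit time with its production deployment time; the sample data is invented, and summarizing with the median is one reasonable choice rather than a prescribed one.

```python
from datetime import datetime, timedelta
from statistics import median


def lead_times(changes: list[tuple[datetime, datetime]]) -> list[timedelta]:
    """Time from commit to successful production deployment for each change."""
    return [deployed - committed for committed, deployed in changes]


# Hypothetical (committed_at, deployed_at) pairs.
changes = [
    (datetime(2025, 3, 3, 10, 0), datetime(2025, 3, 3, 16, 0)),  # 6 hours
    (datetime(2025, 3, 4, 9, 0), datetime(2025, 3, 5, 9, 0)),    # 24 hours
    (datetime(2025, 3, 5, 13, 0), datetime(2025, 3, 5, 15, 0)),  # 2 hours
]

typical = median(lead_times(changes))
print(typical)  # prints "6:00:00"
```

The median keeps a single slow change from dominating the figure, which a plain average would not.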
There are several reasons why you want a shorter LTC:
1. Faster feedback loops
Getting changes through testing and into production faster delivers feedback to your developer and product teams faster, too.
2. Smaller batch sizes
Delivering changes to production more rapidly almost always leads to smaller batches that are lower risk and easier to test.
3. Improved agility
When you deliver value faster, you’re more agile. You can pivot quickly based on client feedback and business change.
4. Developer satisfaction
We mentioned developer frustration when your rework rate is too high. A low lead time for changes is the converse: when you deliver changes quickly, your developers spend less time waiting and more time building momentum.
Choosing the right metrics
Deciding which metrics to focus on can be tricky. Rather than picking ones that sound interesting or get a lot of online attention, use them to help you improve your results.
What are your pain points?
One way to choose the right metric is by identifying how it can help with your current problems.
- Are you shipping new features and customer value too slowly? Work on your lead time and deployment frequency. Make an effort to shorten the time between finishing new features and getting them into production.
- Are you spending too much time putting out fires? You need to improve your change failure rate and mean time to recovery.
- Are your developers fixing existing features instead of writing new ones? Rework rate and defect escape rate will tell you how to improve.
- Is your code brittle and resistant to change? Code coverage may be the leading indicator you need to break those logjams.
Leading vs. lagging indicators
Software quality metrics fall into one of two categories. Lagging metrics tell you what already occurred: change failure rate, mean time to recovery, defect escape rate, and rework rate are indicators of how things are going.
Leading metrics predict what may happen: code coverage, deployment frequency, and lead time for changes are indicators of ongoing processes. Pick a lagging metric to track success and a leading measure to drive your behavior.
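The pain points and the leading/lagging split described above can be combined into a small lookup. The dictionary keys are shorthand labels invented for this sketch; the metric groupings come straight from the text.

```python
LEADING = {"code coverage", "deployment frequency", "lead time for changes"}
LAGGING = {"change failure rate", "mean time to recovery",
           "defect escape rate", "rework rate"}

# Shorthand pain-point labels (hypothetical) mapped to the metrics suggested above.
PAIN_POINT_METRICS = {
    "shipping too slowly": ["lead time for changes", "deployment frequency"],
    "putting out fires": ["change failure rate", "mean time to recovery"],
    "fixing instead of building": ["rework rate", "defect escape rate"],
    "brittle code": ["code coverage"],
}


def metrics_for(pain_point: str) -> dict[str, list[str]]:
    """Split the suggested metrics for a pain point into leading and lagging."""
    metrics = PAIN_POINT_METRICS[pain_point]
    return {
        "leading": [m for m in metrics if m in LEADING],
        "lagging": [m for m in metrics if m in LAGGING],
    }


print(metrics_for("putting out fires"))
```

When a pain point maps only to lagging metrics, as it does here, you would borrow a leading metric from another row to drive day-to-day behavior.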
Avoid vanity metrics
Your goal is to improve outcomes. Anything else is noise. So always be sure to focus on the metrics that drive results, not the ones that sound good online or on paper.
Focusing on a metric like code coverage is a common trap. “95% code coverage!” sounds impressive, but it’s not a guarantee of quality.
Coverage can be useful if you’re trying to manage technical debt or improve performance in a specific module. But taken on its own, it’s merely a distraction.
Result-oriented metrics like CFR and MTTR should always be a part of your dashboard.
Metrics are not goals
One of the biggest pitfalls in adopting metrics is treating them as performance targets. Use these measures as tools for improving your team’s performance, not as a carrot or stick for quarterly performance reviews.
Google’s DORA guide puts it succinctly: “…making broad statements like, ‘Every application must deploy multiple times per day by year’s end,’ increases the likelihood that teams will try to game the metrics.”
How to prioritize your efforts
If you’re starting from scratch, you can’t focus on all seven metrics at once. You need to prioritize. This table will help you decide where to begin.
Metric Comparison & Selection Matrix
| Metric | Best Used When… | Risk of Misuse |
| --- | --- | --- |
| Lead time for changes | You feel slow, releases are infrequent, or time-to-market is a competitive disadvantage. | Teams rush code to lower the number, increasing bugs. |
| Deployment frequency | You’re stuck in “big bang” release cycles or your lead time for changes remains high. | Teams deploy trivial changes just to boost the count. |
| Change failure rate (CFR) | You have frequent outages and rollbacks. | Teams stop deploying to keep the number low. |
| Mean time to recovery (MTTR) | When things break, they stay broken for too long. | Teams focus on quick fixes and workarounds that don’t solve root causes. |
| Rework rate | Your teams are spending too much time fixing bugs rather than building features. | Hard to define “unplanned” consistently across teams. |
| Defect escape rate | Customers are finding bugs that QA missed, or your rework rate remains stubbornly high. | Can create a “blame game” between dev and QA. |
| Code coverage | You are refactoring legacy code or need to prevent regression in critical paths. | Teams write useless tests just to hit an arbitrary target. |
Choosing the right tool
After you identify your critical metrics, you need the right tools for addressing them.
You can link three of the metrics we covered—change failure rate, defect escape rate, and code coverage—directly to effective testing.
You can also enhance three more—deployment frequency, rework rate, and lead time for changes—with faster and more robust testing procedures. So, let’s see how Tricentis’s qTest can help.
Use case: Increasing enterprise quality with Tricentis Tosca and qTest
Problem
Clayton Homes was struggling to keep pace with development while managing their tests manually. Their technology stack featured very complex data flows, and they lacked the transparency and scalability they needed to grow.
Solution
They kicked off a modernization process that would move them from a reactive to a proactive approach for identifying defects and improving their product quality. A large part of that process involved pairing Tricentis qTest with Tosca.
Outcome
The new approach gave Clayton increased efficiency and improved visibility into their testing processes.
With qTest, they can track coverage and map tested features directly to business risks. Not only are their pipelines running more efficiently, but they’ve seen improved team morale, too.
How qTest Helps
Agentic test creation
Success in three of the metrics we’ve discussed relies on your testing. Defect escape rate and rework rate point to issues that make it to production.
Code coverage points to code that lacks tests. You can address all these issues directly with qTest Agentic Test Creation.
These agents can generate tests for paths that manual creation often misses. This means you have better coverage and better odds of catching defects before they escape to production and generate rework.
AI-scaled automated testing
More tests should result in more reliable software, but without the right automation tools, it can also mean longer lead times and reduced deployment frequency.
Fortunately, qTest makes it easy to sync those AI-generated tests to Tosca, your automation platform.
This automates the step in your QA procedures that creates the longest wait time. So, your team gains the confidence to create more deployments while still catching regressions faster.
Scalable test operations with traceability
Traceability is one of the most effective tools you have for keeping your change failure rate low and for keeping the time you spend recovering from failed deployments from climbing.
It makes it easy for your team to start with a requirement and follow it right through to its final test before they push to production.
Analytics and reporting
You can’t improve what you can’t see. That’s why you need customizable, powerful dashboards that make it easy for you to follow metrics like coverage, defects, and velocity. You get all this, and more, with qTest’s out-of-the-box and custom reports.
Improving your software quality
We’ve covered seven of the most important software quality metrics. Change failure rate, mean time to recovery, defect escape rate, and rework rate offer valuable insights into how your applications perform in production.
Deployment frequency, code coverage, and lead time for changes are indicators of how your internal processes deliver value. Taken together, they offer a robust guide to improving your software quality by measuring outcomes instead of hype.
Check out qTest today and take the first step toward driving software quality with agentic testing, enterprise automation, and customizable dashboards.
This post was written by Eric Goebelbecker. Eric has worked in the financial markets in New York City for 25 years, developing infrastructure for market data and financial information exchange (FIX) protocol networks. He loves to talk about what makes teams effective (or not so effective!).
