Measuring Software Quality

Measuring quality is difficult – there are many aspects you can track. You certainly can’t just look at a single aspect on its own. For example, a product with 0 bugs isn’t necessarily high quality software – no one might be using it! However, collecting, measuring and analysing quality stats is an overhead – these activities need to provide real business value.

So, which aspects and metrics are most valuable? Here are my thoughts …

Bugs

Simply measuring all the bugs that have been reported is not that valuable. Generally, users don’t care about bugs in software as long as the bugs don’t get in the way of using it. So, in the bug tracking system, I have the following 3 fields that let me focus on the most valuable bugs:

  • Customer reported. The first aspect I focus on is whether a bug was reported by a customer. If a customer has reported a problem, it is impacting their use of the system in some way – so, we need to do something about it. We’ll find a lot of issues during our internal development processes, but these could be on paths that users never go through.
  • Scope. Scope is how many users the bug impacts. I generally track this as ‘high’, ‘medium’ or ‘low’ – a specific number is impossible to gauge. This requires knowledge of how users use the system, so it can be difficult to judge – particularly if the product doesn’t track feature usage.
  • Severity. This is the level of impact the bug is causing. Again, I simply go for ‘high’, ‘medium’ or ‘low’, with ‘high’ meaning that the user cannot use the feature because of the bug and there is no workaround.

I then calculate the priority score for a bug by multiplying together the weights for who reported it, its scope and its severity:

  • Customer reported = 2, internally reported = 1
  • High scope = 3, medium = 2, low = 1
  • High severity = 3, medium = 2, low = 1

So, a high scope, high severity, customer reported bug will have a calculated priority of 18 (2 × 3 × 3), and a low scope, low severity, internally reported bug will have a calculated priority of 1 (1 × 1 × 1). The higher the priority score, the more important and urgent the bug is to fix.
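
As a rough sketch, the calculation above could look like this in Python. The weights come straight from the lists above; the `Bug` class, the field names and the `priority_score` function are made up for illustration and are not tied to any particular bug tracking system.

```python
from dataclasses import dataclass

# Weights taken from the lists above.
REPORTER_WEIGHT = {"customer": 2, "internal": 1}
LEVEL_WEIGHT = {"high": 3, "medium": 2, "low": 1}


@dataclass(frozen=True)
class Bug:
    # Hypothetical record; a real tracker will have its own field names.
    reporter: str  # "customer" or "internal"
    scope: str     # "high", "medium" or "low"
    severity: str  # "high", "medium" or "low"


def priority_score(bug: Bug) -> int:
    """Multiply the reporter, scope and severity weights."""
    return (
        REPORTER_WEIGHT[bug.reporter]
        * LEVEL_WEIGHT[bug.scope]
        * LEVEL_WEIGHT[bug.severity]
    )


print(priority_score(Bug("customer", "high", "high")))  # 18
print(priority_score(Bug("internal", "low", "low")))    # 1
```

Using lookup tables rather than hard-coded numbers keeps the weights easy to tune if, say, customer reported bugs turn out to deserve more than double the weight.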

I then measure 3 aspects of the calculated priority score over time:

  • Found. This is the sum of the priority scores for all the bugs found in a week / month.
  • Fixed. This is the sum of the priority scores for all the bugs fixed in a week / month.
  • Open. This is the sum of the priority scores for all the bugs still open at the end of a week / month.
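
Here is a minimal sketch of how those three sums might be computed for one period. It assumes each bug already carries its calculated priority score, the date it was found and the date it was fixed (if any); the `ScoredBug` type and `period_scores` function are hypothetical names, and the sample data is invented.

```python
from datetime import date
from typing import NamedTuple, Optional


class ScoredBug(NamedTuple):
    # Hypothetical record: a bug with its calculated priority score.
    score: int
    found: date
    fixed: Optional[date]  # None while the bug is still open


def period_scores(bugs, start: date, end: date) -> dict:
    """Found / fixed / open score sums for the period start..end (inclusive)."""
    found = sum(b.score for b in bugs if start <= b.found <= end)
    fixed = sum(
        b.score for b in bugs
        if b.fixed is not None and start <= b.fixed <= end
    )
    # Open at the end of the period: found by then and not yet fixed by then.
    open_ = sum(
        b.score for b in bugs
        if b.found <= end and (b.fixed is None or b.fixed > end)
    )
    return {"found": found, "fixed": fixed, "open": open_}


# Invented sample data for the first week of January 2024.
bugs = [
    ScoredBug(18, date(2024, 1, 2), date(2024, 1, 4)),
    ScoredBug(6, date(2024, 1, 3), None),
    ScoredBug(4, date(2023, 12, 20), date(2024, 1, 5)),
    ScoredBug(2, date(2023, 12, 28), None),
]
print(period_scores(bugs, date(2024, 1, 1), date(2024, 1, 7)))
# {'found': 24, 'fixed': 22, 'open': 8}
```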

The goal is to see 3 trends:

  • Found score is a stable low score or is reducing.
  • Fixed score is greater than the found score.
  • Open score is a stable low score or is reducing.
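
One simple way to check these goals automatically is sketched below. It is only an illustration: the function names are invented, "stable" is interpreted as "never rising by more than a chosen tolerance from one period to the next", and what counts as a "low" score is left to your own judgement rather than encoded here.

```python
def stable_or_reducing(series, tolerance=0):
    """True if no value exceeds the previous one by more than `tolerance`."""
    return all(later <= earlier + tolerance
               for earlier, later in zip(series, series[1:]))


def check_goals(found, fixed, open_, tolerance=0):
    """Check the three trends over equal-length lists of period scores."""
    return {
        "found_stable_or_reducing": stable_or_reducing(found, tolerance),
        "fixed_exceeds_found": all(fx > fo for fx, fo in zip(fixed, found)),
        "open_stable_or_reducing": stable_or_reducing(open_, tolerance),
    }


# Invented scores for four consecutive weeks.
found = [24, 20, 21, 15]
fixed = [26, 22, 23, 18]
open_ = [30, 28, 26, 23]
print(check_goals(found, fixed, open_, tolerance=2))
```

With these numbers all three checks pass: the found score wobbles by at most 1, more is fixed than found every week, and the open score falls steadily.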

[Figure: Quality bug trends]

Usage

As I said before, having no bugs doesn’t mean high quality software, because usage may be low. Also, an increasing found score doesn’t necessarily mean the product is becoming unstable – if lots of users have suddenly started using the product and reporting lots of bugs, it was probably always unstable. So, I cross reference usage against the bug stats.

I measure the number of unique users using the product in a week / month. Unique users matter more than the number of logins because a given user tends to exercise the same functionality each time they log in, so repeat logins add little new coverage of the product. That makes the unique user count the more valuable figure to cross reference against the bug stats.
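
To make the cross reference concrete, here is a small sketch that counts unique users per month from a login log and divides the found score by that count. The log format, the function names and the numbers are all assumptions for illustration; dividing the found score by unique users is just one possible way to do the cross reference.

```python
from collections import defaultdict
from datetime import date


def unique_users_per_month(logins):
    """logins: iterable of (user_id, date). Returns {(year, month): count}."""
    users = defaultdict(set)
    for user_id, day in logins:
        users[(day.year, day.month)].add(user_id)  # sets ignore repeat logins
    return {period: len(ids) for period, ids in users.items()}


def found_score_per_user(found_scores, unique_users):
    """Found score divided by unique users, skipping periods with no users."""
    return {
        period: found_scores[period] / unique_users[period]
        for period in found_scores
        if unique_users.get(period)
    }


# Invented login log and found scores.
logins = [
    ("alice", date(2024, 1, 3)), ("alice", date(2024, 1, 10)),
    ("bob", date(2024, 1, 12)),
    ("alice", date(2024, 2, 1)), ("bob", date(2024, 2, 2)),
    ("carol", date(2024, 2, 3)), ("dave", date(2024, 2, 20)),
]
found_scores = {(2024, 1): 20, (2024, 2): 30}

users = unique_users_per_month(logins)
print(users)                                      # {(2024, 1): 2, (2024, 2): 4}
print(found_score_per_user(found_scores, users))  # {(2024, 1): 10.0, (2024, 2): 7.5}
```

In this made-up data the found score rises from 20 to 30, but the score per unique user falls from 10.0 to 7.5, which is the kind of context a raw bug count hides.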

[Figure: Quality usage trend]

Test Coverage

100% test coverage doesn’t guarantee high quality software – the wrong thing may have been built. Some areas of code are also difficult to cover with automated tests – so you may not get an ROI from reaching 100% coverage. However, high unit test coverage suggests the code is well designed and can be changed with confidence.

I measure the total coverage at the end of a week / month. The goal is 70% or above.
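
A tiny sketch of that check follows. It assumes coverage is measured as covered lines over total lines, which is only one of several possible coverage measures, and the readings are invented; in practice the figures would come from whatever coverage tool you use.

```python
COVERAGE_GOAL = 70.0  # percent, the goal stated above


def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Line coverage as a percentage; 0.0 for an empty code base."""
    if total_lines == 0:
        return 0.0
    return 100.0 * covered_lines / total_lines


def meets_goal(percent: float, goal: float = COVERAGE_GOAL) -> bool:
    """True when coverage is at or above the goal."""
    return percent >= goal


# Invented month-end readings: (covered lines, total lines).
readings = {"2024-01": (6400, 10000), "2024-02": (7350, 10500)}
for month, (covered, total) in readings.items():
    pct = coverage_percent(covered, total)
    print(month, f"{pct:.1f}%", "ok" if meets_goal(pct) else "below goal")
```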

[Figure: Coverage trend]

What are your thoughts? Do you use similar metrics?
