
In this article, we’ll cover the concept of mutation testing: what it is, why it’s important, and what to know in order to get started with it. As you’ll learn, mutation testing isn’t necessarily a distinct category of testing. It’s not, for instance, a competitor to unit testing, but quite the opposite: it’s a technique or approach you can use to make your unit tests better.
What is mutation testing?
Definition
Mutation testing is an approach you can use to assess—and then, hopefully, improve—the quality of your unit tests. We’ll explain it in more depth soon, but for now, understand that, in practical terms, mutation testing is a type of automated software testing that verifies your unit test suite instead of your production code.
What are the benefits of mutation testing?
One of the toughest challenges you can face when writing unit tests is assessing how good your tests are. Yes, code coverage is an important metric, but it can be illusory. It’s totally possible to have high coverage in a suite composed of low-quality tests.
“Test coverage is a useful tool for finding untested parts of a codebase. Test coverage is of little use as a numeric statement of how good your tests are.”
– Martin Fowler, “Test Coverage”
Mutation testing is the answer to that problem, since it gives you a clear assessment of how capable your current tests are of catching the defects you could realistically introduce when changing your codebase.
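To make the coverage problem concrete, here is a minimal Python sketch. The function apply_discount and its test are hypothetical names invented for illustration. The test executes every line and both branches, so a coverage tool would report 100%, yet it asserts nothing and therefore cannot catch any defect.

```python
def apply_discount(price_cents: int, is_member: bool) -> int:
    """Members get 10% off; everyone else pays full price."""
    if is_member:
        return price_cents * 90 // 100
    return price_cents


def test_apply_discount_runs() -> None:
    # Both branches are executed, so line and branch coverage are complete,
    # but no assertion checks the returned values.
    apply_discount(10_000, True)
    apply_discount(10_000, False)


test_apply_discount_runs()  # passes, and would keep passing if the discount logic broke
```

If the 90 were changed to 9, or the condition were inverted, this test would still pass. That gap between "executed" and "verified" is what mutation testing is meant to expose.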
How does mutation testing improve the quality of my test suite?
To understand how mutation testing works and in what way it identifies weak spots in your unit testing suite, let’s imagine what it’d be like to do what mutation testing does, but manually.
Let’s say that all of the following is true:
- You have a function f
- There are unit tests that exercise f
- Currently all tests pass
- You’re confident that f’s code is correct
Based on the above, if you deliberately introduced a defect into your function, one or more unit tests should fail.
If, after introducing some damage to the code of f, all of your tests are still passing, that means that you should either fix some of your current tests or write more tests.
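The manual experiment described above can be sketched in Python as follows. Everything here is an illustrative assumption: f is a small age check, f_mutated is the same function with a hand-made defect, and the two test helpers stand in for a real unit test suite.

```python
def f(age: int) -> bool:
    """Original code: a person counts as an adult from age 18 onwards."""
    return age >= 18


def f_mutated(age: int) -> bool:
    """Deliberately damaged copy of f: '>=' was changed to '>'."""
    return age > 18


def weak_tests(func) -> bool:
    """Return True if all checks pass for the given implementation."""
    return func(30) is True and func(5) is False


def stronger_tests(func) -> bool:
    """The same checks plus the boundary value 18."""
    return weak_tests(func) and func(18) is True


print(weak_tests(f), weak_tests(f_mutated))          # True True  -> the defect goes unnoticed
print(stronger_tests(f), stronger_tests(f_mutated))  # True False -> the defect is detected
```

The weak suite passes for both versions, which signals a missing test. Adding the boundary check makes the damaged version fail, as it should.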
Mutation testing automates all of this. A mutation testing tool will add those defects to your code, run your tests, and then let you know whether the tests were able to detect the defects.
Core concepts of mutation testing
Having briefly explained what mutation testing is, let’s now go a bit deeper and also cover some of the terminology associated with it.
In a simplified way, this is how the mutation testing process works:
- A single defect, called a mutation, is added to the application’s code
- All tests are run
- If at least one test failed, we say the tests killed the mutation; otherwise, we say the mutation survived
- The defect is removed
- The mutation testing tool adds another defect, and the process repeats
After the process finishes, we’ll get the ratio of killed mutations to all generated mutations, often called the mutation score. The higher the score, the better.
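The loop above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any real tool is implemented: shipping_cost, the hand-picked MUTATIONS list, and the text-replacement approach are all invented for the example. Real tools typically work on the syntax tree or bytecode rather than on raw text.

```python
SOURCE = """
def shipping_cost(order_total):
    if order_total >= 50:
        return 0
    return 5
"""

# Hypothetical, hand-picked mutations: (text to find, replacement text).
MUTATIONS = [
    (">= 50", "> 50"),
    (">= 50", "< 50"),
    ("return 0", "return 1"),
    ("return 5", "return 6"),
]


def load(source):
    """Compile the source and return the shipping_cost function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["shipping_cost"]


def run_tests(shipping_cost) -> bool:
    """A tiny stand-in for a unit test suite; True means all tests passed."""
    try:
        assert shipping_cost(100) == 0
        assert shipping_cost(10) == 5
    except Exception:
        return False
    return True


assert run_tests(load(SOURCE)), "tests must pass on the unmodified code"

killed = 0
survived = []
for old, new in MUTATIONS:
    mutant_source = SOURCE.replace(old, new, 1)  # add exactly one defect
    if run_tests(load(mutant_source)):           # run all tests
        survived.append((old, new))              # no test failed: mutation survived
    else:
        killed += 1                              # at least one test failed: killed
    # SOURCE itself is never modified, so the defect is "removed" automatically

score = killed / len(MUTATIONS)
print(f"mutation score: {score:.2f}")  # mutation score: 0.75
print("survived:", survived)           # survived: [('>= 50', '> 50')]
```

The surviving mutation points at a missing boundary test: nothing checks an order total of exactly 50.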
Types of mutations
A mutation testing tool doesn’t just change your code at random. Usually, it applies those changes in a highly systematized way, picking from a list of pre-defined mutations.
The exact list depends on the specific tool and also the specific programming language it targets. What follows is a non-exhaustive list of common categories of mutations you can find:
- Arithmetic mutations: such as replacing a plus sign with another operator.
- Logical mutations: replacing an and operator with an or, or vice versa. Also, negating the condition of an if statement.
- Equality mutations: swapping equality/comparison operators.
- Literal mutations: replacing hard-coded numbers or strings with random values; replacing boolean literals with their opposites.
- Removal mutations: removing keywords, expressions, or even whole code blocks.
These are just some of the categories of mutations that many tools use. Tools will also provide more mutations that are specific to the language they target. For instance, a tool targeting C# might offer mutations targeting LINQ methods.
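As a rough illustration of how pre-defined mutations can be applied systematically, the following Python sketch uses the standard ast module to rewrite a small function. The function eligible, the Mutator class, and the category names are assumptions made for this example. For brevity, it applies every match within a category at once, whereas real tools usually apply one mutation at a time. It requires Python 3.9 or later for ast.unparse.

```python
import ast

SOURCE = '''
def eligible(age, score, bonus):
    total = score + bonus
    if age == 18 and total > 50:
        return True
    return False
'''


class Mutator(ast.NodeTransformer):
    """Applies all mutations of one category (hypothetical category names)."""

    def __init__(self, category: str) -> None:
        self.category = category

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if self.category == "arithmetic" and isinstance(node.op, ast.Add):
            node.op = ast.Sub()  # + becomes -
        return node

    def visit_BoolOp(self, node):
        self.generic_visit(node)
        if self.category == "logical" and isinstance(node.op, ast.And):
            node.op = ast.Or()  # and becomes or
        return node

    def visit_Compare(self, node):
        self.generic_visit(node)
        if self.category == "equality":
            # == becomes !=, other comparison operators are left alone
            node.ops = [ast.NotEq() if isinstance(op, ast.Eq) else op for op in node.ops]
        return node

    def visit_Constant(self, node):
        if self.category == "literal" and isinstance(node.value, bool):
            node.value = not node.value  # True becomes False and vice versa
        return node

    def visit_If(self, node):
        if self.category == "removal":
            return None  # drop the whole if block
        self.generic_visit(node)
        return node


def mutate(source: str, category: str) -> str:
    """Return the source code with one category of mutations applied."""
    tree = Mutator(category).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


for category in ("arithmetic", "logical", "equality", "literal", "removal"):
    print(f"--- {category} ---")
    print(mutate(SOURCE, category))
```

Because each change is made on the syntax tree, the mutated code is always syntactically valid, which is one reason tools tend to prefer this approach over editing text.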
How does it differ from traditional software testing?
To explain how mutation testing differs from traditional software testing, we must first define what we mean by “traditional.” If by traditional we mean manual software testing, then it’s clear that mutation testing differs from that because it is a form of automated testing.
But what if by “traditional software testing” we mean other forms of automated testing? In that case, mutation testing differs from them in the following ways:
- Unlike unit testing, integration testing, and so on, you don’t write any test cases.
- Mutation testing, unlike unit testing, is usually very slow.
- Because of the above, you’d typically not run mutation testing in the pipeline that runs on every change.
- Mutation testing, unlike unit testing/integration testing, doesn’t target your production code, but your test code.
Regarding the goal of mutation testing, the most striking difference from other forms of testing is probably the last point in the list: mutation testing tests not your application, but your unit tests. As for creating test cases, you simply don’t write any. Many tools allow some degree of control and configuration, but for the most part the tool itself decides which mutations to apply to which portions of the code.
Finally, don’t be put off by the second and third points: while mutation testing can be quite slow, it’s not so slow that it becomes impractical. We’ll cover more of that next, when we discuss the challenges of mutation testing.
Challenges of mutation testing
Mutation testing is a great idea that’s hard to get right in practice. There are many challenges involved. For instance:
- Mutation testing can be extremely slow, considering that mutations are applied to the code one by one, and then for each mutation the whole unit test suite is executed.
- Not all possible mutations are useful, so it’s often hard to determine and apply the ones that correlate with realistic errors that could end up in a code base.
- Mutations need to be applied “cleanly” to a codebase. That is, code must still compile (if we’re talking about a compiled language), which can be tricky to achieve.
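To get a feel for the first challenge, here is a back-of-the-envelope estimate in Python. The figures (2,000 mutations, a 30-second test suite) are made-up assumptions, and the calculation describes the naive worst case in which the full suite runs to completion for every mutation.

```python
def worst_case_runtime_seconds(mutant_count: int, suite_seconds: float) -> float:
    """Naive cost model: the full test suite runs once per mutation."""
    return mutant_count * suite_seconds


seconds = worst_case_runtime_seconds(2_000, 30.0)  # hypothetical figures
print(f"{seconds / 3600:.1f} hours")  # 16.7 hours
```

Even a modest suite quickly adds up to hours under this model, which is why reducing the number of mutations or the number of tests run per mutation matters so much.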
Thankfully, there’s a good deal of research happening on the topic of mutation testing, and many of the challenges above are being solved in practice. For instance, some mutation testing tools, when integrated with pipelines, are smart enough to only generate mutations for the portion of the code impacted by the pull request, thus reducing the total run time for the tests.
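The idea of limiting mutations to changed code can be sketched as a simple filter. This is an assumption-laden illustration, not the mechanism of any particular tool: the Mutant class, the file paths, and the changed_lines mapping (which a real integration would derive from the pull request diff) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutant:
    """A candidate mutation, identified by where it would be applied."""
    path: str
    line: int
    description: str


def mutants_in_diff(mutants, changed_lines):
    """Keep only the mutants that fall on lines touched by the change."""
    return [m for m in mutants if m.line in changed_lines.get(m.path, set())]


# Hypothetical data: all candidate mutants, and the lines a pull request changed.
all_mutants = [
    Mutant("billing.py", 12, "replace >= with >"),
    Mutant("billing.py", 40, "replace + with -"),
    Mutant("users.py", 7, "negate if condition"),
]
changed_lines = {"billing.py": {10, 11, 12}}

selected = mutants_in_diff(all_mutants, changed_lines)
print(selected)  # only the mutant on billing.py line 12 remains
```

With this kind of filter, the run time scales with the size of the change rather than the size of the whole codebase.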
As a general recommendation, don’t add mutation testing to the same pipeline that runs every time a PR is opened or merged. Instead, consider creating an entirely separate pipeline that runs nightly or even weekly, runs only the mutation testing tool, and uploads the resulting analysis somewhere all interested parties can see it.
Conclusion
Mutation testing is a somewhat different form of testing: you don’t write test cases, and it targets your tests instead of your actual application. As an approach, mutation testing can be a really valuable tool in your toolbelt. You can use it to make code coverage a more meaningful metric and reach the highest possible quality for your unit test suite.
What are the actual tools you can use? For Java, the state-of-the-art tool seems to be PIT Mutation Testing. Stryker Mutator is another popular tool, and it supports C#, JavaScript, and Scala. For Python, mutmut seems like the most popular currently maintained tool.
This post was written by Carlos Schults. Carlos is a skilled software engineer and an accomplished technical writer for various clients. His passion is to get to the bottom (the original source) of things and captivate readers with approachable and informative technical content.
