

Thanks to recent advancements in large language models (LLMs), AI unit testing is now a reality. But how can it help you?
While AI unit testing is not a panacea, it’s undoubtedly a dramatic shift in the way software teams work with unit testing. If done right, AI unit testing can bring immense value to your team.
That’s why this guide exists: to show you the main pain points of manual unit testing, how AI unit testing solves them, and some best practices you must be aware of.
What is AI unit testing?
AI unit testing refers, unsurprisingly, to the use of AI to help create and maintain unit tests. In the context of this guide, when we say “AI,” we mean specifically the recent crop of tools and assistants powered by LLMs, also called generative AI.
Unless we say otherwise, we’re not referring to the overall broad field of artificial intelligence.
Defining AI unit testing isn’t hard, but what makes it valuable? To understand that, let’s take a step back and focus on traditional unit testing—specifically, its pain points.
Traditional unit testing and why it’s painful
Unit testing is a great tool for any developer to have in their toolbox. I’ll sing its praises to anyone patient enough to listen to such singing. But I’d be lying if I said that unit testing is always a walk in the park; it does have its share of pitfalls and pain points, and now we’re going to cover some of them.
Unit testing often involves a lot of boilerplate and duplication
Many unit tests you write end up being very similar to others you’ve written before.
Also, depending on your specific language and testing tools, there is some ceremony required to set up your tests, and that impacts your development velocity.
Unit testing often misses edge cases
This is, of course, not a problem of unit testing per se, but of our limited human brains. Unfortunately, humans often fail to account for unlikely but high-impact scenarios when writing tests. And then those scenarios turn into costly production bugs.
Maintaining unit tests often becomes a burden
Unit testing suites often degrade in quality as they grow. If unattended, the situation might reach a point where the maintenance of your unit tests becomes a burden.
Eventually, developers stop caring: running the unit tests and keeping them passing takes up a greater and greater portion of their time, and they have features to ship.
AI unit testing enters the game
The pain points we’ve covered are far from the only ones you might encounter; however, we chose them because AI unit testing is particularly well-suited to solving these kinds of issues.
AI unit testing solves boilerplate
AI-assisted software engineering, which includes AI unit testing, really shines when it comes to reducing duplication and boilerplate. Or, better put, reducing the need for you, the human programmer, to write such boilerplate yourself.
AI unit testing can make writing tests far more efficient. For instance: if, for a given behavior, I need a reasonable number of examples, I’ll often write the first test manually, then hand it to the LLM and ask it to generate N more test cases like it, preferably using the parameterized-tests feature of my unit testing framework.
That way, I feel like I’m in control, guiding the LLM, but I let the tool greatly increase the speed of my workflow.
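As a sketch of that workflow, here’s what the result might look like. I’m assuming Python and pytest here, since the article doesn’t prescribe a stack, and the `truncate` function is a made-up example:

```python
import pytest

# Hypothetical function under test: truncates text and appends an ellipsis.
def truncate(text: str, max_len: int) -> str:
    if len(text) <= max_len:
        return text
    return text[:max_len].rstrip() + "..."

# The first case was written by hand; the LLM was then asked to generate
# the remaining cases using pytest's parameterized-tests feature.
@pytest.mark.parametrize(
    "text, max_len, expected",
    [
        ("hello world", 5, "hello..."),
        ("hello", 10, "hello"),   # shorter than the limit: unchanged
        ("", 3, ""),              # empty input
        ("abc   ", 4, "abc..."),  # trailing spaces stripped before the ellipsis
    ],
)
def test_truncate(text, max_len, expected):
    assert truncate(text, max_len) == expected
```

The human writes the interesting first case; the LLM fills in the table.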
AI unit testing helps you consider edge-case scenarios
In my AI unit testing adventures, I’ve had great success using this workflow:
- I show the LLM the class I’m currently testing
- I show it the tests that I have written for that class
- I ask it to identify ways in which my current tests are faulty or insufficient
Often enough, but not always, the LLM does come back with suggestions to strengthen my unit tests, catching nuanced edge cases that I would’ve failed to see by myself.
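As a hypothetical illustration (both the `average` function and the suggested test are invented for this sketch), the review step might surface something like this:

```python
# Hypothetical function under test, shown to the LLM during the review step.
def average(values):
    return sum(values) / len(values)

# The hand-written test only covers the happy path.
def test_average_happy_path():
    assert average([2.0, 4.0]) == 3.0

# Edge case the LLM flagged: an empty list raises ZeroDivisionError,
# a scenario the original test never exercised.
def test_average_empty_list():
    try:
        average([])
        assert False, "expected ZeroDivisionError"
    except ZeroDivisionError:
        pass
```

Whether the right fix is to document the exception, raise a clearer one, or return a default is now an explicit design decision rather than a surprise in production.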
AI unit testing relieves the burden of maintenance
When you change a constructor or method argument and, all of a sudden, dozens of your tests no longer pass, AI can certainly help you fix everything much quicker than you would manually.
But here’s a warning: finding yourself in that situation too often means there are problems in the way you’re creating your tests. You’re likely overspecifying in your assertions and/or writing tests that are too tightly coupled with the implementation of the system under test.
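One common mitigation, sketched here in Python with hypothetical names, is to route test-object creation through a single factory helper, so a constructor change touches one place instead of dozens of tests:

```python
# Hypothetical class whose constructor recently gained a `currency` argument.
class Invoice:
    def __init__(self, customer: str, amount: float, currency: str = "USD"):
        self.customer = customer
        self.amount = amount
        self.currency = currency

# Single factory used by every test; when the constructor changes,
# a sensible default is added here and the rest of the suite keeps working.
def make_invoice(**overrides) -> Invoice:
    defaults = {"customer": "acme", "amount": 100.0, "currency": "USD"}
    defaults.update(overrides)
    return Invoice(**defaults)

def test_invoice_amount():
    assert make_invoice(amount=250.0).amount == 250.0
```

Each test names only the fields it actually cares about, which also keeps the assertions from overspecifying.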
Implementing AI unit testing: best practices
You now understand:
- Some of the main pain points of traditional unit testing
- How AI-assisted unit testing can solve or alleviate those pain points
Now we’re going to cover a non-exhaustive list of best practices so you can get the most out of your AI unit testing.
Use short iterations when writing tests
When adding tests to your suites, work in short iterations—also called baby steps. You might be using test-driven development (TDD), but even if you’re not, there’s nothing stopping you from working in short cycles that ensure that your code builds and all tests pass at all times.
Example of this workflow:
- Write a short amount of code (or prompt the LLM to write it)
- Use the LLM to generate a few unit tests for the written code
- Run the tests and verify that they pass
- Deliberately add a defect to the code you’ve just written and re-run the tests, so you can ensure they fail
- Remove the defect you introduced, and if applicable, test the code manually
- Start the cycle again
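Steps 2 through 4 of that cycle might look like this in practice (assuming Python; the leap-year function is a made-up example):

```python
# Step 1-2: a small function plus an LLM-generated test for it.
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def test_is_leap_year():
    assert is_leap_year(2024)
    assert not is_leap_year(2023)
    assert not is_leap_year(1900)  # divisible by 100 but not by 400
    assert is_leap_year(2000)      # divisible by 400

# Step 4: temporarily break the code and re-run the tests. For example,
# changing the condition to just `year % 4 == 0` should make
# test_is_leap_year fail on 1900 -- proving the test has teeth.
```

If the defect you introduce doesn’t make any test fail, that’s a signal the generated tests are weaker than they look.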
Working in small steps ensures that the application is in a good state most of the time and you remain in the loop. Making sure you see the tests failing helps you assess their quality.
The common properties of unit tests — small scope, done by the programmer herself, and fast — mean that they can be run very frequently when programming.
– Martin Fowler, Unit Test
Give detailed instructions when prompting
In my experience, when LLMs write tests, they favor heavy use of mocks: a style of unit testing geared more toward interaction verification than state verification. Personally, I think that leads to tests that are tightly coupled to implementation details and, therefore, brittle.
LLMs also tend to overspecify when writing tests. That is, their tests assert against details that aren’t really relevant to the behavior being tested and that add little value; asserting against exact exception messages is a common example.
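To make the contrast concrete, here’s a hypothetical example of both styles in Python; all the names are invented:

```python
from unittest.mock import Mock

class PriceService:
    def __init__(self, repo):
        self._repo = repo

    def total(self, item_ids):
        return sum(self._repo.price_of(i) for i in item_ids)

# Interaction-heavy style: asserts *how* the collaborator was called.
# Renaming price_of or batching the lookups breaks this test, even
# though the observable behavior is unchanged.
def test_total_interaction_style():
    repo = Mock()
    repo.price_of.side_effect = [10, 20]
    service = PriceService(repo)
    service.total(["a", "b"])
    repo.price_of.assert_any_call("a")
    repo.price_of.assert_any_call("b")
    assert repo.price_of.call_count == 2

# State-verification style: asserts only the result, using a simple fake.
def test_total_state_style():
    class FakeRepo:
        def price_of(self, item_id):
            return {"a": 10, "b": 20}[item_id]

    assert PriceService(FakeRepo()).total(["a", "b"]) == 30
```

When prompting, I’ll say explicitly which of these styles I want, and the LLM generally complies.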
I understand these are opinionated statements. But regardless of whether you agree with me on those matters, make sure to give detailed instructions when prompting LLMs for unit tests. That way, you ensure the results will be aligned with what you consider to be great unit tests.
Make sure people are educated on what constitutes good unit testing
AI-assisted coding is a great help in making teams more efficient. But at the end of the day, humans are still the ones held accountable. That means a human must stay active in the loop, reviewing the code and having the final say on whether it’s up to standard.
But of course, to judge whether code is good or not, people need to be trained. Make sure that, as an organization, you are educating people, especially junior folks and new grads, on what constitutes good code and good unit tests, and that they are equipped to make judgments about AI-generated code.
Conclusion
Unit testing is an awesome ally in the fight against code entropy. But let’s be frank: historically, many developers didn’t like doing it. Often, they only wrote tests because the powers that be mandated it, which resulted in poor tests written just to satisfy coverage metrics.
Besides the benefits we’ve covered in this guide, AI unit testing might also help in that regard. LLMs help developers write tests much faster than before, and they can also flag brittle tests and highlight overlooked edge cases. As a result, disillusioned developers might come to enjoy unit testing again, and our codebases will benefit from that renewed engagement.
If you want to go beyond unit tests and see how AI can help teams with automated testing in general, take a look at Tricentis Tosca, an end-to-end testing platform accelerated by AI.
This post was written by Carlos Schults. Carlos is a skilled software engineer and an accomplished technical writer for various clients. His passion is to get to the bottom (the original source) of things and captivate readers with approachable and informative technical content.
