How to test application throughput: Keep it real


Tricentis Staff

Various contributors

Date: Jan. 06, 2020

Web applications are often measured by throughput, a way of quantifying the volume of requests and responses over time. Transactions per second (TPS) is the most common unit. A performance test plan usually contains specific throughput goals, and the go/no-go decision for rolling out a new release or architectural change relies heavily on the web application handling a certain TPS. Management wants a “pass” stamp, but it’s your job to make sure that the achieved TPS is realistic, not an illusion built on phony numbers.

My advice is to “keep it real” by generating workloads that represent the true characteristics of production. Faking it leads to a false-positive test result: a certain TPS was met and the “pass” stamp was awarded, but the conditions were unrealistic.

For example, you could achieve a 460 TPS result by hitting mostly lightweight transactions, or by running a low load of virtual users with little or no think time. In each of these cases, the throughput would be “high,” but the workload would not represent what’s really happening in production. Not even remotely. What’s worse, if you “pass” a performance test using these unrealistic methods, you have no idea whether the deployment will withstand the production workload. If the application falls over... guess who is on the hot seat? An unrealistic test is often unintentional, so be sure your tests are set up properly and create a realistic workload.
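One way to see how the same number can come from very different workloads is the interactive response time law for closed systems (a consequence of Little's Law): throughput is roughly the number of users divided by response time plus think time. The sketch below is a back-of-the-envelope illustration in Python, not part of any load tool. The function name, user counts, and timings are all hypothetical, chosen only so that both cases land on 460 TPS.

```python
def closed_loop_tps(users, response_time_s, think_time_s):
    """Interactive response time law: X = N / (R + Z).

    Assumes a closed workload in which each virtual user issues one
    transaction, waits for the response, thinks, and then repeats.
    """
    cycle_s = response_time_s + think_time_s
    if cycle_s <= 0:
        raise ValueError("response time plus think time must be positive")
    return users / cycle_s


# Hypothetical "faked" test: few users, lightweight transactions, no think time.
faked = closed_loop_tps(users=115, response_time_s=0.25, think_time_s=0.0)

# Hypothetical realistic test: many users, slower transactions, human pauses.
realistic = closed_loop_tps(users=2300, response_time_s=1.0, think_time_s=4.0)

print(f"faked: {faked:.0f} TPS, realistic: {realistic:.0f} TPS")
```

Both runs report the same TPS, yet the second one holds twenty times as many concurrent users on the backend, which is exactly the difference a bare throughput number hides.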

Start with design

How do you accomplish that? The design of the test determines how realistic the generated throughput is. There are several key factors to take into consideration. They are all equally important, and the underlying philosophy is (again) “keeping it real.” Using a load tool to simulate virtual users executing scripts is a load test, but you also need to emulate accurate activity, conditions, behaviors, and usage. Each script executed in a load test contains simple requests and more complex business transactions. When setting up a test of an application, remember that not all transactions are created equal: throughput is affected by the “weight” of a transaction.

For example, lightweight Transaction A can be as simple as serving up a static image, while heavyweight Transaction B can be as complicated as a business transaction that runs algorithms on the results of a database query. The response time of Transaction A is going to be much quicker, and it will use fewer resources within the deployment, since it is just a web server response. Conversely, we can expect Transaction B to have a much longer response time and to use more resources, including the database. When the load tool is executing a script, it waits for the response to one transaction before executing the next. You can see how the response times of transactions affect throughput: the faster the response times, the higher the throughput. Because a test is this easy to manipulate, you must create conditions that truly mimic expected production activity.
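To make the effect of transaction weight concrete, here is a small Python sketch of one virtual user that waits for each response before sending the next request. It is an illustration under stated assumptions, not a measurement: the two response times and both transaction mixes are hypothetical, and think time is deliberately left at zero to isolate the effect of weight.

```python
def mean_response_time_s(mix):
    """Mix-weighted mean response time.

    mix is a list of (share, response_time_s) pairs whose shares sum to 1.
    """
    total_share = sum(share for share, _ in mix)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * rt for share, rt in mix)


def per_user_tps(mix):
    """Transactions per second for one virtual user with no think time.

    The user waits for each response before sending the next request,
    so its rate is the inverse of the mean response time.
    """
    return 1.0 / mean_response_time_s(mix)


LIGHT_S = 0.05  # hypothetical Transaction A: static image
HEAVY_S = 2.0   # hypothetical Transaction B: database-backed business logic

skewed = per_user_tps([(0.9, LIGHT_S), (0.1, HEAVY_S)])
balanced = per_user_tps([(0.5, LIGHT_S), (0.5, HEAVY_S)])
print(f"90% light: {skewed:.2f} TPS per user")
print(f"50% light: {balanced:.2f} TPS per user")
```

With these assumed numbers, skewing the mix toward the lightweight transaction raises per-user throughput roughly fourfold without the application getting any faster.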

Virtual users should mirror real users

To accurately mimic expected production activity, you must first ensure that each virtual user represents one real user. If you expect a concurrent load of 2500 users actively using your web application, then you need a test that ramps to 2500 virtual users. This is extremely important because every virtual user has a unique footprint on the backend servers in sessions, memory usage, open sockets, etc. Trying to get away with a high-throughput test without an accurate number of virtual users will lead to inaccurate resource usage on the backend. Only a test that uses the true number of users will emulate the right load conditions.

There are typically different “types” of users per web application: shoppers, buyers, admins, etc. When setting up the test, create a population (a transaction mix) that represents the workload during peak usage in production. For example, 50% shoppers, 40% buyers, 10% admins. The most accurate transaction mixes are determined by log reviews or by business analysis of expected usage in production. The scripts executed by the tool’s load generators need to represent “true” user profiles. The scripts need to follow transaction flows: navigations, decisions, inputs, calculations, etc. These transactions need to use dynamic data to be realistic. Dynamic transaction flows include choices of different products, link requests, form submissions, or even more complicated things like extracting response data to be used in subsequent requests. It is the dynamic scripts that emulate the diverse activity of real users.
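As a rough sketch of turning a percentage mix into a concrete population, the Python snippet below splits a total number of virtual users across user types. It uses the 50/40/10 example mix and the 2500-user figure from this article. The function name and the largest-remainder rounding are illustrative choices, not a feature of any particular load tool.

```python
def allocate_users(total_users, mix_percent):
    """Split total_users across user types by whole-number percentages.

    Uses integer arithmetic and hands any leftover users to the types with
    the largest remainders, so the counts always add up to total_users.
    """
    if sum(mix_percent.values()) != 100:
        raise ValueError("percentages must sum to 100")
    counts = {name: total_users * pct // 100 for name, pct in mix_percent.items()}
    remainders = {name: total_users * pct % 100 for name, pct in mix_percent.items()}
    leftover = total_users - sum(counts.values())
    # Give one extra user to each of the types with the largest remainders.
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


MIX = {"shopper": 50, "buyer": 40, "admin": 10}
print(allocate_users(2500, MIX))
```

For 2500 users this yields 1250 shoppers, 1000 buyers, and 250 admins. The remainder handling only matters for totals that do not divide evenly, where it keeps the population from silently losing a user or two.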

Include think time

To make those robotic virtual users act like human beings, pauses and delays need to be incorporated into the scripts. We all need to think, take in information, process it, make decisions, type out forms, etc. These “breathers” contribute to accurate load characteristics. During these pauses, the servers are still performing housekeeping: closing ports, collecting garbage, gauging timeouts, sweeping sessions, etc., all of which take resources. If you hit the deployment with simultaneous users and simply crank up the throughput, the load characteristics are not realistic.
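Most load tools have their own think-time settings, so the following Python sketch only shows the general idea: a randomized pause between the steps of a user flow. The 3 to 10 second range, the function names, and the step names are assumptions made for illustration. Real values should come from production logs or analytics.

```python
import random
import time


def think_time_s(rng, low_s=3.0, high_s=10.0):
    """Pick a pause length in seconds, uniformly between low_s and high_s."""
    return rng.uniform(low_s, high_s)


def run_user_flow(steps, rng, sleep=time.sleep):
    """Run each step in order, pausing between steps like a human would."""
    for index, step in enumerate(steps):
        step()
        if index < len(steps) - 1:  # no pause needed after the final step
            sleep(think_time_s(rng))


# Dry run: record the pauses instead of sleeping, so the demo finishes instantly.
pauses = []
visited = []
steps = [lambda name=name: visited.append(name)
         for name in ("home", "search", "checkout")]
run_user_flow(steps, random.Random(42), sleep=pauses.append)
print(visited, [round(p, 1) for p in pauses])
```

Randomizing the pause, rather than using one fixed delay, keeps virtual users from marching in lockstep and hitting the servers in synchronized bursts.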

With today’s rich Internet applications (RIAs), scripts must incorporate complex behaviors, such as asynchronous updates of data pushed from servers to browsers and vice versa, independent of full-page refreshes. The tool needs to “listen” for these updates and re-create them in a script that emulates the activity. This is a hurdle in performance testing, but it is also critical in creating the right load characteristics. These rich behaviors affect throughput through polling, streaming, and other reactive mechanisms, all of which must be accounted for.

Connection speed matters

Another factor to consider is that users connect to web and mobile applications at many different network speeds. The connection speed affects the download rate: the slower the download, the higher the response time. For your populations of users, choose the bandwidth(s) that really represent the end-user connections. For example, you may be testing a LAN application with a known bandwidth restriction. Or a group of users may be connecting from another country. A certain percentage may only access the web application via a mobile network.
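A quick piece of arithmetic shows why bandwidth belongs in the test design. The Python sketch below estimates an idealized transfer time as payload size divided by link bandwidth. The page weight and both connection profiles are hypothetical, and the model ignores latency, protocol overhead, and parallel connections, so treat the results as a lower bound rather than a prediction.

```python
def transfer_time_s(payload_bytes, bandwidth_bits_per_s):
    """Idealized time to move a payload over a link of the given bandwidth."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_bytes * 8 / bandwidth_bits_per_s


PAGE_BYTES = 1_500_000  # hypothetical page weight

# Hypothetical connection profiles, in bits per second.
PROFILES = {
    "office LAN (100 Mbit/s)": 100_000_000,
    "mobile (1.5 Mbit/s)": 1_500_000,
}

for name, bits_per_s in PROFILES.items():
    print(f"{name}: {transfer_time_s(PAGE_BYTES, bits_per_s):.2f} s")
```

Under these assumptions the same page takes about 0.12 seconds on the LAN and 8 seconds on the mobile link, so a test run entirely at LAN speed would report response times that many real users never see.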

Taking all these factors into consideration (user profiles, accurate number of users, transaction mixes, dynamic data, think times, bandwidth simulation, behaviors, etc.) sets the stage for creating realistic throughput. Review your scripts and make sure your tests really represent expected production. Once you have designed realistic tests, you can execute and properly evaluate whether or not the web application can achieve the set throughput goals.

This blog was originally published in 2012 and was refreshed in July 2021.

