By Deb Cobb, Enterprise Product Manager
The setup and management of the software testing ecosystem is one of the most prominent challenges testers face. Although most load and performance testing is executed in pre-production or QA environments, seasoned testers acknowledge an exasperating truth: no test lab can completely mirror the production environment. Purchasing duplicate production hardware for testing is too expensive, and cloning production data is too cumbersome.
The reality is that testers can only approximate their production environments by setting up testing ecosystems composed of a complex assortment of best-of-breed solutions stitched together to form a coherent testing environment. As a result, performance teams accept that they need to apply both pre-production and production tests to their projects.
Given the challenges of setting up comprehensive test environments that duplicate production configurations, more companies and consultants are testing in production. Although testing in production provides a solution to the environmental problem, it requires that testers adhere to best practices to test successfully and manage the many risks associated with this approach.
Shift Left vs. Shift Right testing
With the adoption of DevOps, there is abundant discussion about testing early in the pipeline (i.e., Shift Left testing). Although testing early and often is essential and encouraged, it is only one element of a complete and comprehensive testing strategy. Product teams that practice agile testing methodologies are also embracing Shift Right testing, which is testing in production.
Performance, load, and stress testing push every facet of the application to its limits. The intent is to discover what it can handle, what user experience it will manifest, and what bugs are revealed. The problem is that pre-production test conditions don’t always accurately mirror the production environment. Testing in production builds another layer of validation into software releases; teams can evaluate the behavior of new releases “in the wild.” Testing in production not only ensures full test coverage but also fosters more resilient software.
Don’t launch tests without knowing the consequences
Measuring the performance of applications tested only in the production environment can be risky. To avoid costly performance issues, we recommend testing in the QA environment as well. Also, teams need to be on hand to react immediately, depending on the impact the test has on the production environment.
Monitor infrastructure during test in production
Testing in production requires constant monitoring of the entire architecture. Performance engineers must have holistic insight into the health of the production environment to:
- Stop the test to avoid significant production issues
- Correlate and identify bottlenecks in the application
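The stop condition in the first bullet can be automated. The following is a minimal sketch, not a reference implementation: the metric names (`error_rate`, `p95_latency_ms`, `cpu_utilization`) and the threshold values are assumptions chosen for illustration and do not come from any particular monitoring tool.

```python
# Illustrative limits only; metric names and values are assumptions.
# Tune them against your own production baseline.
DEFAULT_LIMITS = {
    "error_rate": 0.02,        # fraction of failed requests
    "p95_latency_ms": 1500.0,  # 95th percentile response time
    "cpu_utilization": 0.85,   # fraction of total CPU
}


def breached_limits(sample, limits=DEFAULT_LIMITS):
    """Return the names of the metrics in `sample` that exceed their limit."""
    return [name for name, limit in limits.items() if sample.get(name, 0.0) > limit]


def should_stop_test(samples, limits=DEFAULT_LIMITS, consecutive=3):
    """Stop only when the last `consecutive` samples each breach a limit,
    so a single noisy reading does not abort the run."""
    if len(samples) < consecutive:
        return False
    return all(breached_limits(s, limits) for s in samples[-consecutive:])
```

A test controller could call `should_stop_test` each time a new monitoring sample arrives and halt load generation as soon as it returns `True`. Requiring several consecutive breaches is a design choice that trades a slightly slower reaction for fewer false alarms.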
Moreover, proper monitoring alone will not secure a test in production; adequate resources are also required to make the right decisions. Such resources include compute/CPU cores, storage at all tiers, networking bandwidth, and so on. The point is that testing environments are often far smaller than their real production counterparts. It is a fallacy to assume that testing on a couple of white box servers with the same operating system and database versions, a few switches, and no load variance (other than what the load test software generates) gives confidence that the application will perform under real-world loads in a real-world production environment.
Understand that testing in production can impact real users
We all agree that load testing in production can affect users or business processes already working in the environment. The impact is linked to the primary test objectives. For example, there would be a more significant impact on real users with a limit test than with a constant load model. Whatever the test objectives, performance engineers must consider the effect of the load they apply and manipulate it accordingly. Real users accessing the application during these tests add noise to the results, obfuscating performance test analysis and interpretation. With this mix of real and virtual traffic, it can be challenging to identify the cause of the performance issue, as it could be caused by:
- The load applied by the test
- The business process called by the production traffic
- A combination of both
Excessive noise in the environment will only make understanding the performance testing results that much more difficult. To avoid such issues, run load tests during low-traffic hours or after deploying a new release of the application.
Don’t generate load on third-party applications
Applying load to an application that calls third-party services indirectly creates load on the partner’s environment. This may have legal ramifications, and it may also lead the third party to block or blacklist the traffic during the test, generating errors that affect load testing objectives.
Therefore, most testers remove requests directed to third parties. Keep in mind that this workaround will slightly alter all response times retrieved during the test.
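One way to apply this filter to a recorded scenario is to keep only requests whose host belongs to you. The sketch below assumes the scenario is available as a plain list of URLs and that `first_party_hosts` is a set of domains you own; both are illustrative assumptions, and load testing tools typically offer their own exclusion settings.

```python
from urllib.parse import urlsplit


def split_by_ownership(urls, first_party_hosts):
    """Split recorded request URLs into (kept, removed) lists.

    A request is kept only if its host is one of `first_party_hosts`
    or a subdomain of one of them.
    """
    kept, removed = [], []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        is_first_party = any(
            host == owned or host.endswith("." + owned)
            for owned in first_party_hosts
        )
        (kept if is_first_party else removed).append(url)
    return kept, removed
```

Reviewing the `removed` list before the run is worthwhile, since it shows exactly which calls will be missing from the measured response times.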
Create testing data on the production environment
Load testing usually requires a large dataset to generate representative traffic (e.g., logins and products), and it is often challenging to generate this data for use in a production environment.
Some business transactions will generate data in the back-office systems of the company. On an e-commerce website, for example, validating an order can feed test data into the back office and trigger calls to other back-office services of the company.
To utilize test data in the production environment:
- Disconnect from production and connect to a testing database instead. (This is possible when the environment is not connected to several applications.)
- Create a dedicated testing account in production. Note that this can sometimes be difficult and may not even be possible in the production environment.
- Avoid test steps that generate records in the back office (e.g., avoid validating a test order).
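The last two options can be sketched in a few lines. Everything named here is hypothetical: the `loadtest-` prefix, the `example.invalid` email domain, and the `writes_back_office` flag on scenario steps are illustrative conventions, not features of any specific tool.

```python
TEST_ACCOUNT_PREFIX = "loadtest-"      # hypothetical reserved prefix
TEST_EMAIL_DOMAIN = "example.invalid"  # reserved TLD, so mail is never delivered


def make_test_accounts(count):
    """Build `count` clearly labeled test accounts that are easy to find and purge."""
    accounts = []
    for i in range(1, count + 1):
        username = f"{TEST_ACCOUNT_PREFIX}{i:05d}"
        accounts.append({"username": username, "email": f"{username}@{TEST_EMAIL_DOMAIN}"})
    return accounts


def is_test_account(username):
    """Identify accounts created for load testing, e.g. to exclude them from reports."""
    return username.startswith(TEST_ACCOUNT_PREFIX)


def safe_steps(steps):
    """Drop scenario steps flagged as creating back-office records."""
    return [step for step in steps if not step.get("writes_back_office", False)]
```

A reserved prefix makes it straightforward to filter test activity out of business reports and to clean up afterward.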
Implement your tests during light-usage time windows
Review your analytics to determine the best time to execute tests. If you cannot access accurate analytics, consider testing during the night (e.g., 11 PM–5 AM ET), after deploying a new release, or during designated maintenance hours.
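If your analytics can export request counts per hour of day, finding the quietest window is a small calculation. This sketch assumes a list of 24 hourly totals (index 0 is midnight to 1 AM); the function name and input format are assumptions for illustration.

```python
def quietest_window(hourly_requests, window_hours):
    """Return (start_hour, total_requests) for the contiguous window with
    the least traffic. The window may wrap past midnight."""
    if len(hourly_requests) != 24:
        raise ValueError("expected 24 hourly counts")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")
    best_start, best_total = 0, None
    for start in range(24):
        total = sum(hourly_requests[(start + i) % 24] for i in range(window_hours))
        if best_total is None or total < best_total:
            best_start, best_total = start, total
    return best_start, best_total
```

When several windows tie, the earliest start hour is returned. Averaging the hourly counts over several weeks before calling the function helps avoid picking a window that was quiet only by chance.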
Use service virtualization or a testing database
Testers should consider the impact of external service calls on their applications. The impact can be significant depending on the number and frequency of the calls, which may trigger high-demand, resource-intensive processing. Understanding how these calls affect performance is critical. To keep the load test effective while still exercising these features, we recommend employing service virtualization, which allows testers to replace the third party or back office with a service that emulates its responses.
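To make the idea concrete, here is a minimal sketch of a virtual service built only on Python’s standard `http.server` module rather than a dedicated service virtualization product. The endpoints, payloads, and simulated latency are hypothetical; a real stub should mirror the responses and timing observed from the actual third party.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Canned responses keyed by (method, path); paths and payloads are hypothetical.
CANNED = {
    ("POST", "/payments/authorize"): (200, {"status": "approved", "stub": True}),
    ("GET", "/shipping/rates"): (200, {"rates": [{"carrier": "stub", "price": 4.99}]}),
}
SIMULATED_LATENCY_S = 0.05  # assumed typical latency of the real service


def lookup_stub(method, path):
    """Return (status, body_bytes) for a request; unknown routes get a 404."""
    status, payload = CANNED.get((method, path), (404, {"error": "no stub defined"}))
    return status, json.dumps(payload).encode("utf-8")


class StubHandler(BaseHTTPRequestHandler):
    def _reply(self):
        # Drain any request body before answering.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        status, body = lookup_stub(self.command, self.path)
        time.sleep(SIMULATED_LATENCY_S)  # emulate the third party's response time
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = _reply
    do_POST = _reply

    def log_message(self, format, *args):
        pass  # keep load-test output quiet


def start_stub(port=0):
    """Start the stub on localhost in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After calling `start_stub()`, the application under test would be configured to send its third-party calls to `server.server_address` instead of the real endpoint. Simulating latency matters: a stub that answers instantly would make measured response times look better than they will be in production.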
To avoid impacting business data, remove all interactions with third-party services and backend systems.
In an ideal world, the more closely the testing environment matches the production environment, the more accurate the performance benchmarking results. However, creating a test environment that is an exact replica of the production environment is practically impossible. As a result, the performance results testers gather in a test environment never fully reflect production behavior, because of key differences between the two environments.
Although most companies avoid testing in production (TiP) based on the potential impact on real-world user activities and data, testers can reduce the impact by following team-based and process-based best practices.
TiP ensures that the:
- Expected load is supported by the live environment
- End-to-end user experience is acceptable
- Network equipment or the CDN can adequately handle the anticipated load
When it comes to testing in production, testers need to proactively monitor the application under test. By monitoring, we mean not only retrieving technical counters from the architecture but also measuring end-user performance on a regular basis. Synthetic monitoring, for example, has the advantage of allowing QA to run a single user journey from several locations while alerting testers to abnormal response times. Monitoring helps operations identify and resolve production issues before real users encounter them.
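The core of a synthetic check can be sketched as follows. This is an illustration under stated assumptions: each journey step is a plain callable (in practice it would drive an HTTP client or a browser), the baseline durations are values you have measured yourself, and the `tolerance` factor of 1.5 is arbitrary. Scheduling the journey from several locations is outside the scope of the sketch.

```python
import time


def run_journey(steps, clock=time.perf_counter):
    """Run each (name, action) step in order and return {name: seconds}."""
    timings = {}
    for name, action in steps:
        started = clock()
        action()
        timings[name] = clock() - started
    return timings


def abnormal_steps(timings, baseline, tolerance=1.5):
    """Return, in sorted order, the steps whose duration exceeds
    `tolerance` times their baseline duration."""
    return sorted(
        name
        for name, seconds in timings.items()
        if name in baseline and seconds > tolerance * baseline[name]
    )
```

Running `run_journey` on a schedule and alerting whenever `abnormal_steps` returns a non-empty list gives operations an early signal that is independent of real user traffic.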
This blog was originally published in May 2018 and was last refreshed in July 2021.