Cloud Cost Testing

Cloud cost testing involves exercising realistic workloads in pre-production environments and then analyzing the results for cost efficiency.

Reason for Topic

As organizations increasingly migrate to the cloud, they often face the challenge of managing cloud costs efficiently while delivering high-quality applications and infrastructure. Moving workloads to the cloud does not guarantee financial savings; in fact, costs often increase. Similarly, engineers already familiar with datacenter and on-premises logistics (but not responsible for every detail of migrating workloads to the cloud) often have little idea how much those workloads cost in cloud terms (e.g. compute cycles, network transfer, gateway/ingress fees, storage tiers).

Introduction / Definition

One way to tackle this challenge is through cloud cost testing, a practice that can help organizations save time while improving their quality engineering practices. Cloud cost testing involves exercising realistic workloads in pre-production environments and then analyzing the results for cost efficiency. By using cost analysis first to prioritize inefficiencies, organizations can identify where performance can be optimized. This saves time (through prioritization) and increases overall efficiency (teams learn to spot future inefficiencies up front), because cost concerns are addressed before they become major problems.
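As a rough illustration of what "analyzing the results for cost efficiency" might look like, the sketch below converts the resource usage recorded during two pre-production test runs into a cost per 1,000 requests and flags a cost regression. Everything in it is an assumption for illustration: the unit prices, the resource names, the `RunUsage` structure, and the 10% tolerance are hypothetical, and real prices vary by provider, region, and tier.

```python
from dataclasses import dataclass

# Hypothetical unit prices in USD, for illustration only.
UNIT_PRICES = {
    "compute_hours": 0.10,
    "egress_gb": 0.09,
}


@dataclass
class RunUsage:
    """Resource usage measured while one test workload ran."""
    name: str
    requests: int
    usage: dict  # resource name -> quantity consumed


def run_cost(run):
    """Total cost of a run: sum of quantity * unit price per resource."""
    return sum(UNIT_PRICES[res] * qty for res, qty in run.usage.items())


def cost_per_1k_requests(run):
    """Normalize cost by traffic so runs of different sizes are comparable."""
    return run_cost(run) / run.requests * 1000


def cost_regression(baseline, candidate, tolerance=0.10):
    """True if the candidate costs more per 1k requests than the
    baseline by more than the given fractional tolerance."""
    limit = cost_per_1k_requests(baseline) * (1 + tolerance)
    return cost_per_1k_requests(candidate) > limit


# Hypothetical measurements from two runs of the same workload.
baseline = RunUsage("release-1", 100_000, {"compute_hours": 2.0, "egress_gb": 10.0})
candidate = RunUsage("release-2", 100_000, {"compute_hours": 4.0, "egress_gb": 10.0})

if cost_regression(baseline, candidate):
    print(f"{candidate.name} is more expensive per request than {baseline.name}")
```

Normalizing by request count is one possible design choice; a team could just as reasonably normalize by transactions, users, or completed workflows, depending on what the workload represents.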

Closely related to cloud cost testing is the topic of cloud cost management platforms, which seek to provide near-real-time information about spending on cloud infrastructure. While it’s critical to have some way of giving teams visibility into their cloud costs, cloud cost reporting is often structured in a very infrastructure- or tenant-focused manner. When cost analysis is decoupled from the logical usage of applications or mission-critical workflows, resource usage can appear arbitrary. Similarly, consolidating multiple teams and infrastructures into a single ‘shared costs’ model can easily strip the ‘who used what and why’ information from internal development and operations teams’ discussions. Though a product team may not be responsible for paying the production infrastructure bills, its management should always be able, and expected, to review the ongoing cost of operations for a product or service; otherwise, ‘not my problem’ leaves the balancing of a codebase’s cost and value to people who don’t know both and who will inevitably make decisions based on cost alone.
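One way to keep the ‘who used what and why’ information attached to spending is to group billing line items by a tag such as team or workflow. The sketch below is a minimal example under stated assumptions: the line-item shape, the tag keys, and the resource names are hypothetical and do not correspond to any specific provider’s billing export.

```python
from collections import defaultdict

# Hypothetical billing line items; real exports have many more fields.
LINE_ITEMS = [
    {"resource": "vm-1", "cost": 120.0, "tags": {"team": "checkout"}},
    {"resource": "db-1", "cost": 30.0, "tags": {"team": "checkout"}},
    {"resource": "vm-2", "cost": 80.0, "tags": {"team": "search"}},
    {"resource": "lb-1", "cost": 40.0, "tags": {}},  # shared, untagged
]


def cost_by_tag(items, tag_key, untagged="untagged"):
    """Sum cost per value of tag_key; items without the tag are
    collected under a separate bucket so they stay visible."""
    totals = defaultdict(float)
    for item in items:
        owner = item.get("tags", {}).get(tag_key, untagged)
        totals[owner] += item["cost"]
    return dict(totals)


for owner, total in sorted(cost_by_tag(LINE_ITEMS, "team").items()):
    print(f"{owner}: {total:.2f}")
```

Keeping untagged spend in its own bucket, rather than spreading it evenly, makes the size of the unattributed ‘shared costs’ pool explicit.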

Note: using cost analysis as a prioritization ‘lens’ focuses on one dimension in what often is a multi-dimensional problem space, including performance, scalability, efficiency, and other constraints on teams. However, the cost ‘lens’ can simplify the process of finding areas for optimization such that other nuances can be factored in as needed per team and organization.

Benefits & Examples

The benefits of cloud cost testing extend beyond financial improvements; it often identifies opportunities for time savings, risk reduction, and architectural improvements that lead to future efficiencies.

For example, by identifying underutilized resources through cost analysis, organizations can re-allocate those resources to improve application performance, which can ultimately save time (e.g. fewer support calls and on-call incidents stemming from reliability issues) and improve user experience. In addition, while looking for cost savings through reserved instances or spot instances, organizations can reduce the time it takes to provision resources (smaller instances are usually faster to spin up) and improve scalability (by understanding where better auto-scaling rules could be applied).
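A simple starting point for spotting underutilized resources is to compare average utilization during a test run against a threshold. The sketch below assumes CPU utilization samples (in percent) have already been collected per resource; the resource names, the sample values, and the 20% threshold are hypothetical, and a real analysis would likely also consider memory, I/O, and peak load.

```python
from statistics import mean

# Hypothetical CPU utilization samples (percent) gathered during a test run.
CPU_SAMPLES = {
    "api-server": [60.0, 70.0, 80.0],
    "batch-worker": [5.0, 10.0, 15.0],
    "report-service": [2.0, 4.0, 3.0],
    "new-node": [],  # no data yet; not enough evidence to flag
}


def underutilized(samples_by_resource, threshold_pct=20.0):
    """Return the names of resources whose average utilization is
    below threshold_pct. Resources with no samples are skipped."""
    flagged = []
    for name, samples in samples_by_resource.items():
        if samples and mean(samples) < threshold_pct:
            flagged.append(name)
    return sorted(flagged)


for name in underutilized(CPU_SAMPLES):
    print(f"candidate for downsizing or re-allocation: {name}")
```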

Optimizing infrastructure through cloud cost testing can also help organizations identify and address potential compliance issues. For example, by analyzing cloud usage and identifying areas where compliance requirements are not being met, along with low-use resources and non-production overspend, organizations can make the necessary changes to ensure compliance and reduce unnecessary attack surface.

In addition, by identifying areas where performance can be optimized, organizations can improve their overall architecture. As new technologies and methodologies arise, having a cost-centric view during architectural decisions for both new and existing systems encourages critical thinking in engineering teams, not just senior architects. By continuously testing and optimizing cloud infrastructure, organizations can stay ahead of potential issues and improve the overall quality of their cloud applications and infrastructure.

Drawbacks / Gotchas

One drawback to cloud cost testing is that, well, it takes some effort and resources. Performance and end-to-end tests don’t write themselves, but if they are already in your pre-production plan and the right testing technologies are in place, test assets can be reused across environments. Likewise, the domain knowledge and skills required can be tricky to acquire or retain; however, as with the test assets, this ‘tribal knowledge’ can be shared not only between environments but also between teams and projects as engineers grow their competencies.

Summary

Cloud cost testing is an essential practice for organizations looking to manage their cloud costs and improve their quality engineering practices. By prioritizing inefficiencies using cost analysis first, organizations can achieve greater efficiency, save time, improve compliance, and deliver high-quality cloud applications and infrastructure.