Strange but True Stories of Cloud-Cost Optimization
The Internet is awash with generic guidance about managing cloud costs. You can find plenty of articles on how to track cloud spending, how to rightsize cloud workloads from a cost perspective, and so on.
But sometimes, the best lessons about cloud-cost optimization don't come from lists of best practices. They emerge from genuine, real-world stories about what organizations have done to reduce cloud spending and why those efforts worked (or didn't work).
Inspired by that reality, I decided to sit down and chart some of the stranger tales of cloud-cost management I've encountered in my career as a FinOps consultant. While I can't promise that these anecdotes will become the next viral stories on your Instagram feed, I can promise that they offer grounded, real-world perspective about cost-management challenges experienced by actual businesses—and how FinOps helps solve those challenges.
Read on for some colorful tales from the genre of cloud-cost management—along with lessons you may be able to apply to your own cloud-cost optimization initiatives.
The CIO Who Turned Off All the Non-Prod VMs at Once
What's the fastest way to stop overspending in the cloud? Shutting down every single non-production server instance you have running, then seeing which ones your teams end up actually missing.
That's the radical approach taken by one CIO at a company we worked with. Fed up with excess cloud spending and slow progress toward cost reduction, he instructed his engineers to spend a Sunday evening shutting down every non-production cloud VM they had running—then restore only those that turned out to be important for getting the business back up and running.
Somewhat surprisingly, the idea worked. About 30% of the cloud servers that the business shut down never came back on—because they weren't essential. That translated to tremendous savings.
To be clear, I don't recommend this approach. A healthier, lower-risk way to reduce spending is to identify the specific cloud resources you don't need, then shut down just those. Doing that proactively also makes it less likely that executives will resort to more radical measures to bring spending under control. But sometimes you get lucky with the nuclear option, as the CIO in this case did.
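As a rough illustration of the targeted approach, the sketch below flags non-production VMs that look idle instead of shutting everything down. The VM record, the inventory entries, and the 5% CPU threshold are all hypothetical assumptions for the example; in practice the data would come from your cloud provider's inventory and monitoring tools, and low CPU alone would not be enough to justify a shutdown.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    environment: str        # e.g. "prod", "dev", "test" (hypothetical labels)
    avg_cpu_percent: float  # average CPU utilization over the review window
    monthly_cost: float     # monthly cost in dollars


def shutdown_candidates(vms, cpu_threshold=5.0):
    """Return non-production VMs whose average CPU is below the threshold."""
    return [
        vm for vm in vms
        if vm.environment != "prod" and vm.avg_cpu_percent < cpu_threshold
    ]


# Hypothetical inventory, invented purely for illustration.
inventory = [
    VM("web-prod-1", "prod", 2.0, 300.0),
    VM("build-agent-7", "dev", 1.5, 120.0),
    VM("qa-db-2", "test", 35.0, 200.0),
    VM("demo-old", "dev", 0.3, 80.0),
]

candidates = shutdown_candidates(inventory)
for vm in candidates:
    print(f"{vm.name}: ${vm.monthly_cost:,.2f}/month")

total = sum(vm.monthly_cost for vm in candidates)
print(f"Potential monthly savings: ${total:,.2f}")
```

A list like this is a starting point for a conversation with the teams that own the VMs, not an automatic kill list.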
The Storage-Management Teams Who Were Strangers
A company can also be very good and very bad at managing its cloud spending at the same time.
I experienced this with a client that had recently gone through an acquisition. It now had two separate teams, neither of which was aware that the other existed, responsible for managing different cloud-storage resources. One team was doing so in a very cost-effective way, taking advantage of features such as intelligent tiering to minimize cloud-storage spending. The other was not, and it was facing increasing pressure from finance leads to get its spending under control.
The solution to this conundrum was simple: We put the two teams in touch with each other so they could work collectively to reduce storage spending. (We also recommended some additional practices that they weren't yet leveraging, but that's beside the point.)
For me, the most interesting part of this experience is the way it highlights how people within large organizations sometimes don't even really know about each other—and they miss out on ways to help each other as a result. It seems crazy, but it happens more often than you might think—especially when organizations go through major structural changes due to a merger or acquisition.
The Vendor Who Was Legally Obligated to Lose Money
One of the first things they teach you in Cloud-Cost Optimization 101 (or would teach you if that were an actual course) is that one of the simplest ways to reduce cloud spending is to choose the right VM instance type for your workloads. In most cases, it's pretty easy to switch to a different instance type to strike the ideal balance between VM cost and performance.
But I once worked with a customer that was unable to modify its instance types—and not for any technical reason. Instead, the problem was that the customer—a B2B vendor that hosted applications used by other businesses—had written a contract stipulating that it would host a client's workloads on a specific type of AWS EC2 instance.
The contract was several years old, and more cost-effective instances had since become available, which we pointed out. But the vendor couldn't switch because of its contractual guarantee. That was bad for the vendor, which was overpaying for EC2 instances, and it was bad for the vendor's client, which was missing out on the performance improvements that newer instance types would have offered.
The moral of this story is that you should write contracts that guarantee your customers the best performance you can deliver. But don't hard-code specific technical configurations that may constrain your ability to optimize your spending or workload performance.
The IT Manager Who Didn't Check His Math
Another simple but effective way to save money in the cloud is to take advantage of reserved instances, which provide discounts on cloud VMs. As one of our clients discovered the hard way, however, the amount you actually save through reserved instances can vary widely.
Eager to reduce his company's cloud spending, an IT manager at the client had moved his company's most expensive instance types to reserved instances. The discounts available for those instance types were minimal, however; the company was saving less than 10%.
We did an assessment and discovered that by selecting reserved instances for other types of VMs, the company could unlock discounts of up to 50%. When we did the math, those savings turned out to outweigh the money the company was saving by reserving its most expensive VM types.
The lesson here is that it's important to perform a holistic financial assessment of your cloud spending and savings opportunities. The simplest or most obvious savings initiative may not deliver the largest reduction in spending.
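To make the arithmetic concrete, here is a minimal sketch that ranks groups of instances by the absolute dollars a reservation would save instead of by how expensive the instances are. The group names, spend figures, and discount rates are hypothetical numbers chosen to mirror the story (a discount under 10% versus one of 50%); they are not the client's actual data.

```python
from dataclasses import dataclass


@dataclass
class InstanceGroup:
    name: str
    monthly_on_demand_cost: float  # current on-demand spend in dollars
    reserved_discount: float       # fractional discount, e.g. 0.08 for 8%


def monthly_savings(group: InstanceGroup) -> float:
    """Dollars saved per month if the whole group is moved to reserved pricing."""
    return group.monthly_on_demand_cost * group.reserved_discount


def rank_by_savings(groups):
    """Sort groups from the largest absolute monthly savings to the smallest."""
    return sorted(groups, key=monthly_savings, reverse=True)


# Hypothetical figures, invented purely for illustration.
groups = [
    InstanceGroup("most expensive instance types", 40_000.0, 0.08),
    InstanceGroup("general-purpose fleet", 25_000.0, 0.50),
]

ranked = rank_by_savings(groups)
for group in ranked:
    print(f"{group.name}: ${monthly_savings(group):,.0f} saved per month")
```

With these assumed numbers, the cheaper fleet comes out well ahead even though it costs less overall, because the savings are the product of spend and discount rate, and neither factor on its own tells you where the money is.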
The Common Thread
All of these stories highlight a common thread I see every day in the realm of FinOps: Cloud-cost management isn't as simple as collecting and analyzing spending data. There are complex human factors at play, too.
In all these tales, the human element is inseparable from the technical considerations. You need to manage both sides of the coin if you want to optimize your cloud spend.