The Cloud Is Not for Cost Savings, But Cost Management Is King
There are countless benefits to migrating to the cloud, expanding your presence and planting deep roots. Amazon Web Services (AWS) has proven to be a groundbreaking platform, perfect for incubating new ideas and scaling them rapidly as they take off. Tremendous creativity can be unleashed by utilizing the flexible, seemingly infinite capacity of AWS. As plentiful as the reasons are, saving money is rarely one of them.
There are certainly opportunities to realize some cost savings across an overall infrastructure, and it cannot be denied that starting new projects in the cloud saves money while also freeing capital and cash flow. But these are exceptions to the overall rule. In fact, most complex workloads cost more to run in the cloud, even after optimization occurs. The opportunity cost of the cloud is extremely low when you factor in the speed to market and the increased effectiveness of your team. But the idea that the cloud will reduce bottom-line infrastructure spend is an enduring myth. Don’t count on it.
Organizations often move to the cloud to reduce spend
The expectation is that savings will occur over time, after optimization. These organizations are often surprised to see their migration and operating budgets blown apart at the point of initial migration, with insufficient relief from efforts to optimize for the cloud. Considering this, it is crucial to realize cost savings opportunities wherever they lie and to manage spend more tightly than most organizations ever needed to before. AWS and other public cloud providers offer a form of utility service, and being careless with such vendors can have a financial result akin to leaving your HVAC system running full blast when you’re on vacation. Times a thousand. Or more.
Key Cost Risks
These three areas have the most potential to blow up your budget.
- Traditional infrastructure is expensive in the cloud. For the most part, cloud providers buy the same data center space, network infrastructure and servers you do. Their true benefit is in layering value-added services on top of that infrastructure, which reduces your development effort and operational overhead. This results in tremendous efficiencies for services that do not need a lot of resources, such as microservices or even reduced-footprint development environments. On the other hand, beefy data warehouses, NoSQL clusters, Hadoop and even routine enterprise applications that are resource hungry carry a significant premium. Strategies to offset these concerns range from hybrid architecture, leaving expensive system components behind in the data center, to cloud optimization, replacing those components with serverless or containerized services.
- Budgeting is unfamiliar terrain. A small set of fixed capital costs turns into a menu of variable costs, most of them priced by the second and by the penny. AWS has more than 150 products and services, and even a moderately complicated environment is likely to use a dozen or more of them. Each is priced based on a consumption model, further complicated by the opportunity to prepay in the form of reservations to obtain a discount. In the past, EC2 instances were the dominant figure on a bill and often the only part of spend that companies looked at. Today, many companies run significant applications without ever consuming EC2 directly.
- Scaling up creates surprises. Teams often size their production environments based on the same rules of thumb they used in the data center. But a variety of factors such as shared hardware, network and storage latency, and even different measurement standards can make the difference between a test environment and production environment far larger than you’re accustomed to. It is not uncommon to think optimization is complete, only to start a new round (or two) of it once production load hits the system and drives cost out of control. When this happens, it is important to study the load and learn what factors drove scaling unexpectedly. While scaling often results in adding more instances to a scaling group, application design is usually the root cause.
Where To Focus
So, where should you focus your attention in managing cost when executing your cloud migration? These four areas are a great starting point:
Find the bottleneck. We’ve seen it enough times at this point to call it a rule and not a coincidence: cost overruns in the cloud are the result of a bottleneck, one expensive constraint that drives scaling activity. Identifying that bottleneck is crucial, but low-hanging fruit (minimizing cross-AZ transfers, batching operations and using correct storage types) is often a distraction from it. Ruthlessly focus on identifying and eliminating your top bottleneck, then repeat. Once the factors that most significantly drive cost are reduced, return to the low-hanging fruit to make sure you are sized right and taking obvious steps to minimize your bill.
Implement cost management processes. There are great tools on the market that go beyond what AWS offers for cost management. They’re typically priced as a small percentage of your AWS spend and, if used correctly, will pay for themselves many times over. A tool alone is not enough, though. Build a meaningful process around your AWS spend that includes periodic review of spend per account, configuring budget alerts, and estimating and then validating spend on new projects.
Optimize for the cloud. We have had to help too many clients extricate themselves from extraordinary cost overruns to recommend shifting unoptimized workloads to the cloud. Lift-and-shift is too expensive and risky for most systems. This does not mean you should move only after a cloud-native rebuild. Instead, identify areas likely to be too costly in the cloud, such as high I/O applications and unusually large, standalone systems. Instances in the cloud are considerably less reliable than virtual machines and an order of magnitude less reliable than physical servers, but their low cost allows many of them to be run at once. Systems that rely on a single beast doing all the heavy lifting often require expensive strategies in the cloud. Such dependencies should be eliminated or reduced before cloud migration begins, or they should be left behind in a hybrid model.
Implement governance at the right time. Governance can gum up the works of a nascent cloud migration, but a lack of governance can also create a menagerie of differences that makes organizational cost management impossible. Governance must be introduced at the right time and should start slowly. Adopting account-level standards, including regular team cost reviews, alongside bill tagging and organizational policies, can ensure that financial planners and other executives have effective visibility into spend.
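As a rough illustration of what a review process built on per-account spend, budget alerts and tagging might look like, here is a small Python sketch. Everything in it is hypothetical: the `LineItem` shape, the account names, the 80% alert threshold and the `team_tag` field are assumptions made for the example, and in practice the line items would come from a billing export or a cost management tool rather than a hand-built list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class LineItem:
    """One row of a (hypothetical) simplified billing export."""
    account: str
    service: str
    cost: float
    team_tag: Optional[str] = None  # None means the resource was not tagged


def spend_by(items: list[LineItem], key: Callable[[LineItem], str]) -> dict[str, float]:
    """Total cost grouped by an arbitrary key, e.g. account or service."""
    totals: dict[str, float] = defaultdict(float)
    for item in items:
        totals[key(item)] += item.cost
    return dict(totals)


def budget_alerts(items: list[LineItem], budgets: dict[str, float],
                  threshold: float = 0.8) -> list[str]:
    """Flag accounts at or above `threshold` of budget, or with no budget."""
    alerts = []
    for account, spent in sorted(spend_by(items, lambda i: i.account).items()):
        budget = budgets.get(account)
        if budget is None or budget <= 0:
            alerts.append(f"{account}: no budget set (spent {spent:.2f})")
        elif spent >= threshold * budget:
            alerts.append(
                f"{account}: {spent:.2f} of {budget:.2f} budget ({spent / budget:.0%})"
            )
    return alerts


def untagged_share(items: list[LineItem]) -> float:
    """Fraction of total spend that cannot be attributed to a team."""
    total = sum(i.cost for i in items)
    untagged = sum(i.cost for i in items if not i.team_tag)
    return untagged / total if total else 0.0


def top_cost_drivers(items: list[LineItem], n: int = 3) -> list[tuple[str, float]]:
    """Services ranked by spend: a starting point for the bottleneck hunt."""
    ranked = sorted(spend_by(items, lambda i: i.service).items(),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:n]
```

A ranking like `top_cost_drivers` only shows where the money goes, not why. It is a prompt for the investigation into the bottleneck, not a substitute for it. Likewise, a high `untagged_share` is a signal that governance has not yet caught up with the migration.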