
Why Snowflake Costs Get Out of Control—and How to Stop It

On paper, cloud computing seems like a CFO’s dream: limited overhead, flexible pricing, scalability. Plus, if you can maintain or increase system performance, engineers and users stay happy. It’s a win-win, right?

Instead, it’s common for CFOs to look at quarterly expenses and realize that Snowflake costs have gotten out of control, sometimes running 200% to 300% higher than expected. It can feel like a bait and switch.

To be fair, your engineers and IT team were right: cloud computing is flexible, scalable, and cost-effective. But none of that happens automatically. Only when you actively invest in optimizing your Snowflake costs can you keep them from getting out of control.

Here are four reasons why it’s easy to lose control of Snowflake costs, and some tips on how to avoid these pitfalls. 

1. Cloud over-provisioning

Over-provisioning has always been a problem in computing; on-prem systems have struggled with it for years. The problem doesn’t go away when you move to the cloud. In fact, it can become even more pernicious.

Unlike on-prem servers, the cloud offers virtually infinite scalability. So while, yes, you can scale down to control spend, the opposite is also true—and there’s no cap on how high you can go. And given that Snowflake incentivizes buyers to pre-purchase credits at a discount (Capacity), it’s very easy to end up with more cloud resources than you need. 

What’s more, expectations rarely align with reality: your query count, query complexity, and usage will always fluctuate. It’s hard to tell in advance how much compute you’ll need, so most organizations simply buy more than they need. If you pre-purchase 100,000 credits for the year but only consume 70,000, that’s 30,000 credits of budget committed to capacity you never used (and depending on your contract terms, unused credits may simply expire).

Whether we’re talking $10K or $1M in additional spend each month, that’s money that could be put to better use elsewhere, funding opportunities your organization currently isn’t pursuing.

The solution to cloud over-provisioning is to scale down whenever possible, as often as possible. We’ve laid this out in more detail in this article.

2. Under-provisioning at Capacity

As mentioned above, expectations often don’t align with reality. While many Snowflake users compensate for this by over-provisioning, others take the opposite approach and under-provision. 

Although it doesn’t seem like it at first, under-provisioning can actually increase your Snowflake costs. You’ll spend less on the initial pre-purchased credits, but once you exceed that amount, the price per credit goes up dramatically, because overage is billed at the higher on-demand rate. Say your discounted capacity rate is $2.30 per credit and the on-demand rate is $3.00: every credit beyond your commitment now costs roughly 30% more.

So the choice to under-provision leaves you with two options: sacrifice performance to keep your credit usage low, or pay the higher rates once you exceed your commitment. Either way, there’s a cost: one indirect, the other direct.

The way to avoid under-provisioning is to reduce your Snowflake spend without sacrificing your performance. Read this article for our tips on how to do that.

3. Lack of visibility into cost savings

Comparing the costs of different Snowflake configurations can be challenging. That opacity obscures the reasons behind usage and cost increases, which in turn makes them more difficult to control.

Here’s an analogy we often use at Keebo. Let’s say it’s the end of the day, and you leave your office at 5pm. You can take the freeway or the local streets, so you check Google Maps to see which route will be fastest. But Google is only providing an estimate based on the information at its disposal; it can’t guarantee which route will actually be faster. The only way to know for sure would be to put your clone in an identical car, have you both leave at the same time, and compare who arrives first.

The same goes with Snowflake. If you want to compare cost savings with 100% accuracy, you would need to replicate every single warehouse at the exact same size, replay every single query, and test the optimized vs. unoptimized queries for performance. That means duplicating your spend, which defeats the purpose of cost optimization in the first place.

If you can’t accurately estimate the costs of various performance options and configurations, then you’ll have a major blind spot that keeps you from controlling your Snowflake costs. 

So what’s the alternative? 

The best solution is to adopt a dynamic model that uses historical data to determine which configurations are more or less likely to reduce costs. For example, take warehouse auto-suspend. An AI model can adopt a dynamic approach: there’s a maximum upper limit, say 30 seconds, but Keebo looks at the incoming query load and usage stats and determines whether the full 30-second auto-suspend is needed, or whether suspending at, say, 13 seconds would be better.

One of the safeguards Keebo has in place: our AI-powered algorithm guarantees that whatever cache hit rate you get with the default setting (in this case, 30 seconds), the same rate will be delivered under the automatic adjustments.

So let’s say the default sits at 30 seconds. If Keebo changes the auto-suspend from 30 to 10 seconds, it guarantees that the number of queries hitting a cold warehouse is the same as what you would have seen with your default value. In other words, no additional costs are incurred, but the potential for savings is significantly higher: all the upside with none of the downside.
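To make that safeguard concrete, here’s a minimal sketch of the underlying idea in Python. This is an illustration, not Keebo’s actual algorithm: it models a query as “cold” if the warehouse sat idle longer than the auto-suspend threshold before the query arrived, and it picks the lowest threshold that produces no more cold starts than the default would have. The history values and the 5-second floor are hypothetical.

```python
DEFAULT_AUTO_SUSPEND_S = 30  # the warehouse's existing default setting


def cold_starts(idle_gaps_s: list[float], auto_suspend_s: float) -> int:
    """Count queries that would have hit a suspended (cold) warehouse."""
    return sum(1 for gap in idle_gaps_s if gap > auto_suspend_s)


def pick_auto_suspend(idle_gaps_s: list[float],
                      default_s: float = DEFAULT_AUTO_SUSPEND_S) -> float:
    """Pick the lowest auto-suspend threshold that produces no more cold
    starts than the default would have (the safeguard described above)."""
    baseline = cold_starts(idle_gaps_s, default_s)
    for candidate in range(5, int(default_s) + 1):  # most aggressive first
        if cold_starts(idle_gaps_s, candidate) <= baseline:
            return candidate
    return default_s


# Idle seconds observed before each recent query (hypothetical history).
history = [2, 4, 1, 45, 3, 2, 60, 5, 1, 2]
print(pick_auto_suspend(history))
# Prints 5: the 45s and 60s gaps go cold at 30s too, and every other gap
# is 5s or less, so suspending at 5s adds zero extra cold starts.
```

In practice, a production model would weigh many more signals (cache reuse, warehouse size, arrival patterns by time of day), but the guarantee works the same way: never do worse than the default.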

With this approach, you can get a real-time window into alternative configurations and the cost savings of each. From there, it’s easier to see which approaches are the most cost-effective and prioritize them. 

4. Relying exclusively on optimization advice

Recommendations, while important, only go so far. You know better than anyone just how valuable your data engineers’ time is. So, the question you need to ask yourself is: are constant manual adjustments to your Snowflake instance the best use of their time?

Because that’s the situation you’ll inevitably end up in. Snowflake configurations are so complex that it’s practically impossible to hand them off to an intern or even junior engineer. The kinds of high-stakes decisions involved here require someone with intimate knowledge of the system and the expertise to make them quickly. 

But those time investments add up, and quickly. Unfortunately, given the complexities mentioned above, optimization becomes a full-time job: you can optimize for what you know today, but tomorrow is different. Any number of changes, most of which are outside your control, can shift your resource needs.

So what’s the alternative? Instead of relying on humans to make real-time changes they can hardly keep up with, use an AI algorithm to monitor dozens of parameters and adjust them dynamically to avoid over- or under-provisioning. Most importantly, these algorithms work constantly, saving you pennies here and dollars there.
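To give a sense of what that monitoring involves, here’s a small Python sketch that pulls one such signal, credit consumption per warehouse over the past week, from Snowflake’s built-in ACCOUNT_USAGE views. The connection details are placeholders, and this is a hand-rolled illustration rather than anything Keebo ships:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder credentials: substitute your own account, user, and auth.
conn = snowflake.connector.connect(
    account="your_account",
    user="your_user",
    password="your_password",
)

# Credits consumed per warehouse over the past 7 days, from Snowflake's
# built-in ACCOUNT_USAGE share (note: this data can lag by a few hours).
SQL = """
SELECT warehouse_name, SUM(credits_used) AS credits_7d
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY credits_7d DESC
"""

cur = conn.cursor()
try:
    for warehouse, credits in cur.execute(SQL):
        # A crude screen: the heaviest credit burners are the first
        # candidates for right-sizing or tighter auto-suspend settings.
        print(f"{warehouse}: {credits:.1f} credits in the last 7 days")
finally:
    cur.close()
    conn.close()
```

That’s one signal, checked once. An automated optimizer tracks dozens of signals like this continuously and acts on them in real time, which is exactly the cadence a human team can’t sustain.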

If you adopt a proactive approach to Snowflake cost optimization, you’re going to find it easier to keep costs under control and maximize every dollar you invest into the cloud. Then you’ll realize the scalability—and full potential—of your cloud infrastructure. 

If you’re ready to get your Snowflake costs under control, Keebo is here to help. See how our automated optimization tool has saved clients significantly without compromising performance.

Author

Collin David