How to Maximize Snowflake Cost Savings without Sacrificing Performance

There’s a common concern among data engineers that Snowflake cost savings come at the expense of performance. On the surface, the idea makes some sense: when you start constraining resource use, any workload that actually needs those resources takes a hit.

But that’s an overly simplistic and surface-level way of thinking about it. If you dig a little deeper, you’ll find the reality to be far more nuanced. 

And, in many cases, you can practice robust Snowflake cost management without sacrificing performance. 

The trick to finding this elusive middle ground is threefold:

  • Dynamic performance monitoring
  • Predictive analytics through artificial intelligence
  • Real-time adjustments, often down to the minute

If you implement these three practices, you’ll end up achieving an optimal Snowflake setup for the lowest possible cost. Read on to see how you can make that happen. 

Why the typical approach to balancing cost savings and performance doesn’t work

Striking a balance between cost savings and performance takes more than adjusting dials and A/B testing configurations. In fact, the typical approach is deeply flawed and seriously limits organizations’ cost savings potential.

Most cost optimization tools create hypothetical scenarios based on past configurations, estimate the credits required to maintain each configuration, and then run a differential analysis against the actuals. However, Snowflake’s complexity means that hypotheticals and reality rarely align.

Here’s an analogy I like to use. Imagine you head home from the office at 5pm. To choose the fastest route home, you search Google Maps, and it tells you the freeway will take 40 minutes, but the local streets will take you 60 minutes. 

But Google can’t guarantee those travel times. It simply takes the (limited) data at its disposal and offers a hypothetical. The only way to know for certain which route is faster would be for your clone to get in an identical car, leave at the same time, and drive the other route.

Likewise, if you want to calculate with 100% accuracy how much your Snowflake optimizations save you, you can’t rely on hypotheticals. Instead, you would have to replicate every warehouse at its exact size, re-run every query, and test the optimized and unoptimized configurations against each other. What’s more, you’d have to do this daily.

You can see the problem: in trying to estimate how much you’re saving on Snowflake, and how your optimizations affect performance, you’d end up doubling your costs, which defeats the purpose of optimizing in the first place.

3 components of balancing Snowflake cost savings and performance

So what’s the alternative? Keebo uses an automated, AI-based tool to not only measure and validate cost savings potential, but also act dynamically so you don’t sacrifice performance. Here are the three components that let us avoid the mistakes described above.

1. Dynamic performance monitoring

Most of the time, Snowflake cost optimization recommendations come from consultants who assume static platform configurations. But that’s not realistic. 

Resource consumption varies over time, in response to fluctuations in user and query load. For example:

  • User counts fluctuate (and ideally grow)
  • Queries become more (or less) frequent and complex
  • A rapidly scaling platform requires additional cloud resources
  • You outgrow your current warehouse capacity and need to provision more resources

In other words, Snowflake is a dynamic environment, and a one-size-fits-all static solution just won’t work. Most attempts to balance savings and performance fail because even if you strike that careful balance, your environment will change and that balance will be lost. 

Unless you have a crystal ball in your back pocket, all you’re doing is optimizing for what you know today. But tomorrow will be different. So the first component of achieving that balance is building your optimization solution to respond quickly and dynamically to changing circumstances.
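
As a rough illustration, here’s a minimal sketch of what dynamic monitoring could look like, using the snowflake-connector-python client and Snowflake’s ACCOUNT_USAGE.WAREHOUSE_LOAD_HISTORY view. The connection parameters, time window, and thresholds are all placeholders, not a description of how Keebo works:

    import snowflake.connector

    # Placeholder credentials -- fill in for your own account.
    conn = snowflake.connector.connect(
        account="YOUR_ACCOUNT", user="YOUR_USER", password="YOUR_PASSWORD",
    )

    # WAREHOUSE_LOAD_HISTORY reports average running and queued query load
    # per warehouse. (ACCOUNT_USAGE views can lag by a few hours; the
    # INFORMATION_SCHEMA table function of the same name is fresher.)
    cur = conn.cursor()
    cur.execute("""
        SELECT warehouse_name, AVG(avg_running), AVG(avg_queued_load)
        FROM snowflake.account_usage.warehouse_load_history
        WHERE start_time >= DATEADD('hour', -1, CURRENT_TIMESTAMP())
        GROUP BY warehouse_name
    """)
    for name, running, queued in cur:
        # Illustrative thresholds: sustained queuing suggests the warehouse
        # is undersized; a near-idle warehouse suggests the opposite.
        if queued > 0.5:
            print(f"{name}: queries are queuing -- consider sizing up")
        elif running < 0.1:
            print(f"{name}: mostly idle -- consider sizing down")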

2. Real-time, down-to-the-minute adjustments

Changes to your Snowflake environment happen quickly, often minute to minute. If you don’t act fast enough, you’ll miss key optimization opportunities. Each one may only save you $0.50 at a time, but multiplied across warehouses and accumulated over hours, days, and weeks, those savings add up quickly.

Making these changes at that pace isn’t something you can do manually. You need an automated, AI-powered solution acting in real time.

For example, one area where companies overspend is warehouse size, largely because they would rather over-provision and maintain performance than maximize savings. But with AI, you really can have the best of both worlds.

Let’s say you have a default warehouse size of Small. An AI can figure out at 1:37 that the warehouse is underutilized and downsize it to an X-Small, which can handle the reduced workload with no negative impact on performance. If, after say 12 minutes, the AI sees an uptick in query load, it can revert to the Small.

In this case, those 12 minutes of running queries on an X-Small can cut your credit usage in half. And if you’re using Keebo, we can validate the savings generated by every one of those adjustments, down to the minute. 
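
To make that arithmetic concrete, here’s a quick sketch using Snowflake’s published compute rates, where each size step doubles the credit rate (an X-Small burns 1 credit per hour, a Small 2):

    # Snowflake credit rates double with each warehouse size step.
    CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4}

    def credits_used(size: str, minutes: float) -> float:
        """Credits consumed by a warehouse running for the given time."""
        return CREDITS_PER_HOUR[size] * minutes / 60

    # 12 minutes on the downsized X-Small vs. staying on the default Small:
    print(credits_used("XSMALL", 12))  # 0.2 credits
    print(credits_used("SMALL", 12))   # 0.4 credits -- twice as much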

3. Predictive optimizations

But there’s even better news: you can optimize your performance even further by using an AI that not only makes adjustments based on query upticks as they happen, but makes predictive adjustments before they happen. 

One example of this is auto-suspend. A common mistake Snowflake users make is setting their auto-suspend as low as possible. While this makes sense on the surface, it can actually hurt both performance and cost. A suspended warehouse loses its cache, which means the next query may have to pull data from cold storage. In fact, a lower auto-suspend can sometimes cost you more: the warehouse thrashes, you constantly lose your cache, and each new query has to load its data off S3 all over again.

The alternative is a predictive approach to auto-suspend. You still set a maximum limit manually, but if the AI’s predictive analysis determines that no queries will arrive within that window, it can shut down the warehouse sooner.

What’s more, an AI can be configured (as Keebo is) so that if queries do hit a cold warehouse, your spend will be no greater than if you had kept your default auto-suspend value. In other words, no additional costs are incurred, but the potential for savings is significantly higher: all the upside with none of the downside.
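
Here’s a minimal sketch of that logic. The predict_idle_seconds() forecaster is hypothetical, a stand-in for whatever model an AI tool actually trains, but the ALTER WAREHOUSE statement is standard Snowflake SQL:

    def predict_idle_seconds(warehouse: str) -> float:
        """Hypothetical forecaster: a real system would model query-arrival
        history here. Returns the expected idle gap in seconds."""
        ...

    def maybe_suspend_early(cursor, warehouse: str, ceiling_seconds: int) -> None:
        """Suspend a warehouse ahead of its configured auto-suspend limit
        when no queries are expected within that window."""
        if predict_idle_seconds(warehouse) > ceiling_seconds:
            # No queries expected before auto-suspend would fire anyway,
            # so shut down now and stop burning credits on idle time.
            cursor.execute(f"ALTER WAREHOUSE {warehouse} SUSPEND")
        # Otherwise do nothing: the manually set AUTO_SUSPEND value remains
        # the worst-case fallback, so idle spend never exceeds the default.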

Final thoughts on balancing Snowflake cost savings with performance

As you’ve probably figured out, achieving a balance between Snowflake cost savings and performance is a nuanced process. But with the right tools, you can not only strike that balance, but also get full transparency into every adjustment, so you can repeat those optimizations even as your usage evolves.

To see how Keebo helped Freshworks achieve 10-15% savings while increasing their workloads, read the case study. 

Author

Collin David