
How to Get the Most Value Out of Your Snowflake Credits

Snowflake, like any data cloud, has a low barrier to entry. No matter your skill level, it’s easy to set up, implement, and expand. This low barrier also means it is easier to bring on additional users, data, and use cases. It’s easy to say “yes” as a Snowflake administrator! Because of this, if you don’t manage your usage and consumption of Snowflake credits, the ballooning costs can cause quite a shock. 

So how do you keep those costs under control? The solution, as with any pay-as-you-go product, is conceptually simple: use it as little as possible, and keep it as small as possible.

Granted, that “simple” strategy is easier said than done. Let’s face it: amid other high-priority tasks, Snowflake cost optimization often falls by the wayside.

That’s why it’s critical to adopt a strategic, automated approach to getting the most out of your Snowflake credits. This article will walk through how exactly to make that happen.

Why cost optimization is critical when using pay-as-you-go cloud platforms

No matter which data cloud you use, controlling costs is always a challenge, especially with a pay-as-you-go model. To understand how this happens in Snowflake specifically, we need to take a closer look at how Snowflake pricing is structured.

Snowflake pricing is based on credits, which are consumed at a rate determined by your warehouse size. There are ten sizes, and each doubles in compute (and credit cost) relative to the one before it. So an X-Small warehouse costs 1 credit per hour, a Small costs 2, a Medium 4, all the way up to the 6X-Large warehouse at 512 credits per hour.
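To make that doubling concrete, here’s a quick back-of-the-envelope sketch in Python. The credit rates follow the doubling described above; the dollar price per credit is a made-up placeholder, since your actual rate depends on your Snowflake edition, region, and contract.

```python
# Rough sketch of Snowflake warehouse credit rates and hourly cost.
# The per-credit price below is a hypothetical placeholder.
SIZES = ["X-Small", "Small", "Medium", "Large", "X-Large",
         "2X-Large", "3X-Large", "4X-Large", "5X-Large", "6X-Large"]
PRICE_PER_CREDIT = 3.00  # hypothetical $/credit; your contract rate will differ

for i, size in enumerate(SIZES):
    credits_per_hour = 2 ** i  # 1, 2, 4, ... up to 512
    print(f"{size:>8}: {credits_per_hour:3d} credits/hr "
          f"(~${credits_per_hour * PRICE_PER_CREDIT:,.2f}/hr)")
```

The takeaway: running one size larger than you need doesn’t add a little to the bill, it doubles the hourly burn.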

There are two options for purchasing Snowflake credits: Capacity or On-Demand. 

  • Capacity. Users pre-purchase a set number of discounted Snowflake credits, and pay significantly more for any credits they use over that number.
  • On-Demand. This is a true pay-as-you-go structure, where Snowflake credits have a higher individual cost, but you only pay for what you need when you need it.

The discount for buying at Capacity is compelling enough that most users choose this option. The trick, then, is to stay within your purchased capacity while also supporting ambitious growth plans that naturally require more resources.
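To see why overage stings, consider a simplified, entirely hypothetical comparison. The rates below are invented for illustration; actual pricing depends on your edition, region, and agreement with Snowflake.

```python
# Hypothetical Capacity vs. overage illustration; all rates are invented.
CAPACITY_CREDITS = 10_000   # credits pre-purchased for the term
CAPACITY_RATE = 2.50        # hypothetical discounted $/credit
ON_DEMAND_RATE = 4.00       # hypothetical $/credit once capacity is exhausted

def term_cost(credits_used: int) -> float:
    """Pre-purchased capacity is paid in full; overage is billed at the higher rate."""
    overage = max(0, credits_used - CAPACITY_CREDITS)
    return CAPACITY_CREDITS * CAPACITY_RATE + overage * ON_DEMAND_RATE

for used in (8_000, 10_000, 12_000):
    print(f"{used:>6} credits used -> ${term_cost(used):,.2f}")
```

In this toy example, every credit consumed beyond the plan costs 60% more than the credits inside it, which is exactly the penalty you want to avoid.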

But for most Snowflake users, the day-to-day reality differs from the initial plan. As you add users and data, your costs may run higher than you originally anticipated.

So how do you avoid paying steep penalties as your needs outpace your plan? Just as we said above: use it as little as possible, and keep it as small as possible. That takes a deliberate cost optimization strategy to ensure you don’t use more Snowflake credits than you absolutely need.

3 reasons why Snowflake users often overspend on the cloud

But why do costs get out of control so quickly? Let’s take a look at three specific reasons why, as you grow, your spending tends to spiral.

1. Data overprovisioning

According to a HashiCorp-Forrester survey, 59% of data cloud users overprovision their resources, making it the leading cause of cloud overspending. Overprovisioning isn’t unique to the cloud (many on-prem data centers face the same challenge), but cloud solutions have a lower barrier to entry and are virtually limitless, so there is no hard ceiling to stop runaway provisioning. Those two factors combined make it far easier for cloud resources to get out of control.

There are a number of common scenarios that can cause Snowflake users to overprovision:

  • Capacity discounts. As mentioned above, users may purchase more Snowflake credits than they need, simply to ensure they never have to go above their limit. 
  • Faster performance. As user expectations for responsiveness continue to rise, engineers and data managers often err on the “safe” side, prioritizing performance over cost and running larger warehouses than they really need.
  • Internal demands. In addition to external user expectations, internal stakeholders often overestimate the resources they’ll require, and those estimates can be well off the mark.

2. No proactive resource management

Given how easy it is for cloud resources to get out of control, proactively monitoring and managing Snowflake credits is mission-critical. Unfortunately, many organizations don’t treat it that way.

There are two main reasons why. First, the average data engineer is far too busy to prioritize the degree of monitoring and adjustment necessary to keep costs under control. Second, even organizations that invest in personnel who focus solely on cost optimization often lack the tools necessary to truly maximize the value they get out of Snowflake credits.

3. Lack of cost optimization resources and tools

Too often, data managers lack the resources and tools to optimize Snowflake costs. They rely on human intervention to manually adjust resource availability and consumption when performance demands change. 

However, this means that optimization only happens when there’s a human available to make necessary changes—only a small percentage of a platform’s typical 24/7 uptime. Without automated resources to make real-time adjustments, data engineers are essentially operating with one hand tied behind their back. 

And even when they do have time to make an optimization, what is effective today may not work for tomorrow’s workload, leading to performance issues and repeat work for the data team.

How to effectively manage your Snowflake credits

Optimizing the use of your Snowflake credits requires two basic steps: 1) actively monitor the system for over- or under-resourcing, and 2) rapidly adjust resource usage, ideally through automation. Most Snowflake cost optimization tools, including those provided by Snowflake itself, focus on Step 1 but offer little to streamline Step 2.

Active monitoring

I already mentioned the gap between Snowflake usage expectations and reality. The only way to gauge your real usage is to monitor it actively, tracking metrics such as:

  • Bytes scanned
  • Compilation and execution times
  • Spillage
  • Query throughput
  • Concurrency
  • Warehouse utilization
  • Data loading performance

By keeping an eye on these seven metrics, you can get a handle on your Snowflake usage and quickly identify areas where you may be over- or under-provisioned. 
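If you want to pull several of these metrics yourself, Snowflake exposes them through the SNOWFLAKE.ACCOUNT_USAGE views. Here’s a minimal sketch using the snowflake-connector-python package; the connection parameters are placeholders, and you would extend the query (for throughput, concurrency, and utilization, views like WAREHOUSE_LOAD_HISTORY and WAREHOUSE_METERING_HISTORY are the usual sources) to fit your own setup.

```python
# Minimal sketch: summarize per-warehouse query metrics for the last 7 days.
import snowflake.connector

METRICS_SQL = """
select
    warehouse_name,
    avg(bytes_scanned)                      as avg_bytes_scanned,
    avg(compilation_time)                   as avg_compilation_ms,
    avg(execution_time)                     as avg_execution_ms,
    avg(bytes_spilled_to_local_storage
        + bytes_spilled_to_remote_storage)  as avg_bytes_spilled,
    count(*)                                as query_count
from snowflake.account_usage.query_history
where start_time >= dateadd('day', -7, current_timestamp())
group by warehouse_name
order by query_count desc
"""

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",  # placeholders
    role="ACCOUNTADMIN",
)
try:
    for row in conn.cursor().execute(METRICS_SQL):
        print(row)
finally:
    conn.close()
```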

Rapid adjustment

Identifying areas for improvement is only the first step. The most efficient way to control costs is to act on those “heavy hitter” users and applications as quickly as possible by rapidly adjusting your data cloud parameters. This includes changing warehouse size and suspend settings. Then, once the “heavy hitting” period has passed, revert to your previous state.
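Mechanically, these adjustments are just ALTER WAREHOUSE statements. Here’s a minimal sketch; the warehouse name, sizes, and auto-suspend value are examples, and the credentials are placeholders.

```python
# Minimal sketch: scale a warehouse up for a heavy period, then revert.
import snowflake.connector

def set_warehouse(cur, name: str, size: str, auto_suspend_secs: int) -> None:
    """Resize a warehouse and tighten its auto-suspend window."""
    cur.execute(
        f"alter warehouse {name} set "
        f"warehouse_size = '{size}' auto_suspend = {auto_suspend_secs}"
    )

conn = snowflake.connector.connect(account="<account>", user="<user>", password="<pw>")
cur = conn.cursor()

set_warehouse(cur, "ANALYTICS_WH", "LARGE", 60)   # before the heavy-hitter window
# ... heavy workload runs ...
set_warehouse(cur, "ANALYTICS_WH", "SMALL", 60)   # revert once it has passed
conn.close()
```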

The faster you make these adjustments, the more control you’ll exercise over your costs. But there’s a problem: manual, human intervention can only happen so fast. Snowflake bills warehouse compute by the second (with a 60-second minimum each time a warehouse starts or resumes), which means even slight delays add directly to your bill.

Automated action

The solution, then, is to automate action on your Snowflake performance metrics. This requires an AI-based tool that continually monitors the metrics listed above, then makes adjustments right away.
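The shape of that feedback loop looks roughly like the sketch below. A production system (Keebo’s included) uses far richer signals and learned models; everything here (the single queued-load signal, the thresholds, the warehouse name) is a simplified, hypothetical illustration.

```python
# Toy monitor-and-act loop: scale up when queries queue, scale down when idle.
# Thresholds, names, and the single signal used here are illustrative only.
import time
import snowflake.connector

SIZES = ["XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"]

def queued_load(cur, warehouse: str) -> float:
    """Average queued load over the last 15 minutes, from a near-real-time view."""
    cur.execute(
        "select avg(avg_queued_load) from table("
        "  information_schema.warehouse_load_history("
        "    date_range_start => dateadd('minute', -15, current_timestamp()),"
        f"   warehouse_name => '{warehouse}'))"
    )
    return cur.fetchone()[0] or 0.0

def adjust(cur, warehouse: str, size_idx: int) -> int:
    """One step of the loop: pick a size based on queueing and apply it."""
    load = queued_load(cur, warehouse)
    if load > 0.5 and size_idx < len(SIZES) - 1:
        size_idx += 1          # sustained queueing: scale up one notch
    elif load == 0.0 and size_idx > 0:
        size_idx -= 1          # no contention: scale back down
    cur.execute(f"alter warehouse {warehouse} set warehouse_size = '{SIZES[size_idx]}'")
    return size_idx

conn = snowflake.connector.connect(account="<account>", user="<user>", password="<pw>")
cur, idx = conn.cursor(), 1    # start at SMALL
while True:
    idx = adjust(cur, "ANALYTICS_WH", idx)
    time.sleep(300)            # re-evaluate every 5 minutes
```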

Many data engineers are understandably concerned about handing over control of their Snowflake system to a “robot.” But there are many ways to maintain control over the process:

  • Transparency. Records of and real-time visibility into automated Snowflake adjustments let you review performance, including the instances where the AI encountered conditions it didn’t expect.
  • Settings controls. Based on those reports and your review, you can adjust the AI’s settings, such as prioritizing cost reduction or maintaining certain performance levels.
  • Security. Keep your data secure by working with reputable partners that hold appropriate data security certifications, including and especially AICPA’s SOC 2 standards.

Example: how Hyperscience achieved 50% Snowflake savings with Keebo

What I’m talking about isn’t theoretical. This is a proven process that has driven real results for companies of all sizes. For example, Keebo has helped Hyperscience, one of our clients, achieve 50% Snowflake savings through our automated cost optimization platform. 

By enabling real-time cost savings, Hyperscience has been able to increase their Snowflake workload and focus on those analytics that drive real value for the business. This enables Hyperscience to predict customer churn and derive more value from their data. 

Learn more about how Keebo helped optimize Hyperscience’s Snowflake credits here. 

Author

Skye Callan