Keebo | See How Keebo’s Autonomous Warehouse Suspension Algorithm Maximizes Snowflake Savings

Among Snowflake’s many built-in cost reduction measures is their auto-suspend system. In a nutshell, this capability suspends inactive warehouses, keeping you from spending compute resources when they’re not needed.

Configuring auto-suspend is straightforward: use Snowflake’s SQL command to implement your settings. For example, you might use an ALTER WAREHOUSE command to specify that a warehouse should suspend 45 seconds after the last query ends:

ALTER WAREHOUSE <your_warehouse_name> 
SET AUTO_SUSPEND = 45;

However, the reality is more complicated. Although Snowflake will attempt to implement this policy, you’ll end up with the warehouse suspended after 60 seconds of inactivity, not 45. The reason, per Snowflake’s documentation: “Setting a value less than 30, or a value that is not a multiple of 30, is allowed but might not result in the expected behavior due to the 30 second poll interval for warehouse suspension.”

In other words, Snowflake only checks to see if a warehouse is active every 30 seconds. If your auto-suspend is set to 45 seconds, the warehouse won’t be suspended until the next 30-second interval, which is 60 seconds. As a result, the warehouse runs 15 seconds longer than you intended. 
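The rounding described above can be sketched in a few lines of Python. This is only a simplified model: it assumes the polls land exactly on multiples of 30 seconds after the last query ends, which real warehouses will not do precisely, and the function name `effective_auto_suspend` is made up for illustration.

```python
import math

POLL_INTERVAL_S = 30  # poll interval quoted from Snowflake's documentation


def effective_auto_suspend(configured_s: int, poll_interval_s: int = POLL_INTERVAL_S) -> int:
    """Round a configured AUTO_SUSPEND value up to the next poll boundary.

    Simplified model: assumes polls fall on exact multiples of the interval
    after the last query ends.
    """
    if configured_s <= 0:
        # 0 disables auto-suspend in Snowflake, so it is out of scope here.
        raise ValueError("configured_s must be positive")
    return math.ceil(configured_s / poll_interval_s) * poll_interval_s


for configured in (1, 30, 45, 60, 61):
    effective = effective_auto_suspend(configured)
    extra = effective - configured
    print(f"AUTO_SUSPEND={configured:>2}s -> suspends after ~{effective}s ({extra}s extra)")
```

Under this model, a 45-second setting behaves like a 60-second one, and any value from 1 to 30 behaves like 30.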

Of course, 15 seconds of activity here and there may not seem like a big deal. But if you compound that over a longer period of time and multiple warehouses of various sizes, it adds up. Consider the following examples:

WH Size | # of WHs | Avg. extra activity for each auto-suspend (sec) | # of auto-suspends per month | Total credits wasted | Total spend wasted (Enterprise Edition, $3.00 per credit)
XS | 20 | 15 | 450 (15 per day) | 37.5 | $112.50
M | 15 | 15 | 450 (15 per day) | 112.5 | $337.50
XL | 5 | 15 | 450 (15 per day) | 150 | $450.00
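The figures in the table above can be reproduced with a short calculation. The sketch below assumes Snowflake's standard rates of 1, 4, and 16 credits per hour for XS, M, and XL warehouses, and reads "450 auto-suspends per month" as a per-warehouse count; the function name is illustrative only.

```python
# Standard Snowflake credit rates per hour for the sizes in the table.
CREDITS_PER_HOUR = {"XS": 1, "M": 4, "XL": 16}
PRICE_PER_CREDIT = 3.00  # Enterprise Edition price used in the table


def wasted_credits(size: str, warehouses: int, extra_seconds: int, suspends_per_month: int) -> float:
    """Credits billed for idle time beyond the intended auto-suspend."""
    extra_hours = warehouses * extra_seconds * suspends_per_month / 3600
    return extra_hours * CREDITS_PER_HOUR[size]


# (size, number of warehouses, extra seconds per suspend, suspends per month)
scenarios = [("XS", 20, 15, 450), ("M", 15, 15, 450), ("XL", 5, 15, 450)]

for size, count, extra, suspends in scenarios:
    credits = wasted_credits(size, count, extra, suspends)
    print(f"{size:>2}: {credits:6.1f} credits, ${credits * PRICE_PER_CREDIT:,.2f}")
```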

Although simplistic, these examples illustrate the problem: small periods of unneeded warehouse activity add up to significant overspending. And if you’re only relying on Snowflake’s default controller, there’s no way to address this issue. 

That’s why Keebo has recently launched a real-time, automated proactive warehouse suspension algorithm. This new tool enables more fine-grained control over auto-suspend intervals. Depending on your current Snowflake configuration, this tool could help you save dollars you didn’t even realize you were wasting. 

How Keebo’s real-time, automated proactive warehouse suspension algorithm works

The key difference between Keebo’s approach and Snowflake’s default behavior is that while Snowflake suspends warehouses in 30-second intervals, Keebo performs checks and implements suspensions within 2.5-5 seconds of your target deadline. This represents a 6-12X improvement in precision over Snowflake’s default controller. 

Most importantly, Keebo’s real-time, automated proactive warehouse suspension algorithm integrates natively with our other machine learning algorithms—warehouse optimization, performance guardrails, query routing, etc.—creating a workload-specific approach to reducing your Snowflake spend. 

Here’s an example to illustrate the benefits of real-time warehouse suspension control: 

Let’s break down the sequence of events in the example above:

  1. The user set Warehouse A’s auto-suspend to 45 seconds.
  2. Warehouse A is asleep at 11:29:00 AM. 
  3. Two seconds later (11:29:02 AM), a query starts, resuming the warehouse. Snowflake starts billing the user for this warehouse’s operation. 
  4. At 12:01:32 PM, the query ends. From this point on, there are no new queries. According to the user’s settings, the warehouse should suspend at 12:02:17 PM.
  5. However, because Snowflake’s auto-suspend occurs in 30-second intervals, the actual suspension takes place at 12:02:33 PM (or 61 seconds after the query ended). 
  6. As a result, Snowflake bills the user for an additional 16 seconds of warehouse operation. 

How would Keebo have handled that situation differently? Instead of operating in 30-second intervals, Keebo’s real-time controller polls the state of each warehouse to learn:

  • Whether the warehouse is running or idle
  • How many queries were last queued
  • When the warehouse was last resumed, if running

Once the warehouse has been idle for the target auto-suspend time, the real-time controller puts the warehouse to sleep using the following SQL command:

ALTER WAREHOUSE <warehouse_name> SUSPEND;

Note: Snowflake promises not to suspend a warehouse while a query is actively running. In our testing we found this is usually the case, though there are rare circumstances where in-flight queries can still fail. Keebo counters this problem by enhancing the simple ALTER command with an additional server-side check that makes sure the warehouse is still idle before suspending. This eliminates the window in which queries can fail.

Let’s return to the example above: 

  1. The user set Warehouse A’s auto-suspend to 45 seconds using Keebo’s real-time, automated proactive warehouse suspension algorithm. 
  2. Warehouse A is asleep at 11:29:00 AM. 
  3. Two seconds later (11:29:02 AM), a query starts, resuming the warehouse. Snowflake starts billing the user for this warehouse’s operation. 
  4. Throughout the duration of the query, Keebo polls the warehouse to determine whether the warehouse is running or idle. (This is indicated by the rays projecting upward toward the timeline.) Note that at poll (0), the warehouse is determined to be active. 
  5. At 12:01:32 PM, the query ends. From this point on, there are no new queries. According to the user’s settings, the warehouse should suspend at 12:02:17 PM.
  6. At poll (1) above, Keebo determines that the warehouse is idle and logs an initial idle observation.
  7. At poll (2) above, Keebo determines the warehouse has been idle for at least 45 seconds, then sends a suspend command (3). The warehouse is then suspended at 12:02:19 PM, or 47 seconds after the query ends.

In this example, the target suspend is delayed by only two seconds, compared to 16 seconds with Snowflake’s default controller. This has saved the customer 14 seconds of wasted billed uptime. 
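The walkthrough above can be sketched as a small idle tracker. This is not Keebo's implementation: the class and field names are hypothetical, the 5-second poll interval is the one mentioned in this article, and the 2-second offset between the query ending and the first idle poll is an assumption chosen so the simulation lines up with the example's numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WarehouseObservation:
    """One poll result (hypothetical shape, for illustration only)."""
    polled_at: float       # seconds relative to the moment the query ended
    running_queries: int
    queued_queries: int


class IdleTracker:
    """Tracks how long a warehouse has been observed idle."""

    def __init__(self, auto_suspend_s: float) -> None:
        self.auto_suspend_s = auto_suspend_s
        self.idle_since: Optional[float] = None

    def observe(self, obs: WarehouseObservation) -> bool:
        """Return True once the warehouse has been idle for the target time."""
        if obs.running_queries > 0 or obs.queued_queries > 0:
            self.idle_since = None  # activity resets the idle clock
            return False
        if self.idle_since is None:
            self.idle_since = obs.polled_at  # initial idle observation
        return obs.polled_at - self.idle_since >= self.auto_suspend_s


tracker = IdleTracker(auto_suspend_s=45)
suspend_at = None
# Polls every 5 seconds; the query ends at t=0, so the poll at t=-3 sees it running.
for t in range(-3, 120, 5):
    obs = WarehouseObservation(polled_at=t, running_queries=1 if t < 0 else 0, queued_queries=0)
    if tracker.observe(obs):
        # A real controller would issue ALTER WAREHOUSE <warehouse_name> SUSPEND here.
        suspend_at = t
        break

print("Suspend command sent", suspend_at, "seconds after the query ended")
```

With these assumed poll times, the first idle observation lands 2 seconds after the query ends and the suspend decision comes 45 seconds later, matching the 47 seconds in the example.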

Using real-time warehouse control & dynamic auto-suspend to maximize Snowflake savings

We just demonstrated how Keebo gives you the fine-grained control over warehouse auto-suspend that Snowflake’s default auto-controller doesn’t. But those savings are dependent on how you define that auto-suspend time in the first place. The challenge is that the ideal auto-suspend time varies from organization to organization and warehouse to warehouse, and is a function of a variety of factors:

  • The warehouse’s currently executing workload
  • Historical performance
  • Workload characteristics across the overall business (inferred from analyzing other workloads in the same account)

As a result, the optimal auto-suspend value changes over time—there’s no ideal “set-it-and-forget-it” value. This means that to maximize the savings auto-suspend gives you, that value should be updated over time. 

A common mistake Snowflake users make is to set their auto-suspend as low as possible. There are risks associated with a warehouse that suspends too soon. Active warehouses maintain a cache of data accessed by recent queries, enabling subsequent queries to run faster. If your warehouse auto-suspends and a new query begins two seconds later, that query will take longer to run because the cache is empty. 

Keebo addresses this problem with a dynamic, AI-driven approach to auto-suspend. First, users set an upper limit (e.g. 45 seconds). Our algorithm then dynamically determines the optimal auto-suspend value, at or below that limit, in real time. This optimization aims to minimize the number of queries that encounter a cold cache, while maintaining query performance levels comparable to those achieved with the user-defined auto-suspend setting.
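To make the trade-off concrete, here is a toy calculation. It is not Keebo's algorithm, which this article does not describe in detail. It assumes a hypothetical list of idle gaps between consecutive queries and ignores polling granularity and any per-resume billing minimums.

```python
from typing import List, Tuple


def evaluate(auto_suspend_s: int, idle_gaps_s: List[int]) -> Tuple[int, int]:
    """Return (idle seconds billed, cold starts) for one candidate setting.

    A gap longer than the auto-suspend value means the warehouse suspends,
    so only `auto_suspend_s` idle seconds are billed, but the next query
    starts against an empty cache.
    """
    idle_billed = sum(min(gap, auto_suspend_s) for gap in idle_gaps_s)
    cold_starts = sum(1 for gap in idle_gaps_s if gap > auto_suspend_s)
    return idle_billed, cold_starts


# Hypothetical idle gaps (seconds) between consecutive queries on one warehouse.
gaps = [2, 5, 20, 40, 300, 900]

for candidate in (5, 15, 30, 45):  # all at or below a 45-second user limit
    idle_billed, cold_starts = evaluate(candidate, gaps)
    print(f"auto-suspend {candidate:>2}s: {idle_billed:>3}s idle billed, {cold_starts} cold starts")
```

In this made-up workload, dropping from 45 to 5 seconds cuts billed idle time sharply but doubles the number of queries that hit a cold cache, which is the tension a dynamic setting has to balance.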

What about credits consumed by the real-time controller? Won’t that offset any of our savings? 

If Keebo’s real-time controller is constantly polling your warehouses to see whether they’re active or idle, doesn’t that consume Snowflake resources? The answer is yes, but with a major caveat: polling the warehouse consumes cloud services credits, not warehouse credits. 

The difference: if your daily cloud services usage doesn’t exceed 10% of your daily warehouse usage, Snowflake doesn’t bill you for it. So even though the real-time controller increases your cloud services activity, as long as that activity stays within the 10% threshold, it adds nothing to your bill. 
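The 10% rule can be expressed as a one-line calculation. The sketch below applies it to a single day; the credit figures are hypothetical and the function name is illustrative.

```python
def billed_cloud_services(cloud_services_credits: float, warehouse_credits: float) -> float:
    """Cloud services credits actually billed for one day.

    Only the portion above 10% of that day's warehouse usage is charged.
    """
    free_allowance = 0.10 * warehouse_credits
    return max(0.0, cloud_services_credits - free_allowance)


# Hypothetical daily figures: 100 warehouse credits, varying cloud services usage.
for cloud_services in (5.0, 10.0, 12.0):
    billed = billed_cloud_services(cloud_services, warehouse_credits=100.0)
    print(f"{cloud_services:>4} cloud services credits used -> {billed:.1f} billed")
```

In this example, polling overhead only shows up on the bill once total cloud services usage for the day passes 10 credits.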

As such, we’ve structured Keebo’s real-time controller around this threshold and determined that five seconds between polls is the optimal interval for reducing daily credit consumption.

It’s impossible to predict whether we’ll exceed that 10% threshold without knowing what a user’s cloud services usage is already. However, suffice it to say that we’re able to strike a balance between maximizing savings and avoiding unnecessary cloud services usage, thus achieving a cost-effective solution. 

Additionally, Keebo’s polling mechanism covers all warehouses in the account. This means the cost of running the controller doesn’t increase linearly with the number of warehouses. In other words, the cost of running the controller is practically fixed. So the more warehouses you have, the more you can save without increasing your cost. 

Final thoughts on Keebo’s real-time controller

Snowflake’s flexible auto-suspend capability presents a major savings opportunity to its users—which is one of the advantages of using their data platform. But as with any tool, it’s important to understand exactly how it works and make sure you’re accurately accounting for your savings.

But here’s the best part. Let’s assume you’re running the smallest (and cheapest) Snowflake configuration: an X-Small warehouse on one cluster with a one-second auto-suspend. Even in that configuration, the Keebo real-time controller still saves you money, because under Snowflake’s 30-second poll interval that “one second” is actually 30 seconds. 

In other words, even if you think you’re running Snowflake as inexpensively as possible, there’s still opportunity to save. 

The best way to learn more about our real-time controller—not to mention our comprehensive Snowflake cost optimization platform—is on a live demo. Schedule some time with our team today.

Authors

Richard P. Spillane