Snowflake Performance Tuning: What Data Teams Typically Get Right (and Wrong)
Optimizing Snowflake costs can be a multi-headed beast, and data teams have multiple strategies at their disposal to eliminate overspending. Since compute costs are often the largest expense category, Snowflake performance tuning is a common approach to reducing spend.
But for all its merits, Snowflake performance tuning is far from a silver bullet. While, yes, you have a high chance of reducing spend, the engineering resources required to do so can be significant. So it’s important to consider the cost-benefit analysis of Snowflake optimization—is the potential cost saving worth the necessary engineering resources?
Let’s walk through what Snowflake performance tuning looks like, some common tactics data teams use, and the pros and cons of each tactic.
How to monitor Snowflake performance
Before you can start tuning Snowflake performance, you first have to monitor it. While there are a range of paid Snowflake monitoring tools available—not to mention those you could build in-house—Keebo offers the core insights you need for free.
Snowflake Workload Intelligence is specifically designed to surface insights you can act on across several key use cases:
- Understand cost drivers. See your most expensive queries and group them by pattern (we call these templates) to identify your heavy hitters.
- Advise your data team. Leverage insights to coach your data team to write more efficient queries and to build schemas and other structures that support efficient query development and execution.
- Stay ahead of the game. Stay informed about your “heavy hitter” users, applications, and queries so you can adjust before they balloon out of control.
- Evaluate warehouse configuration. Analyze current warehouse configurations to determine if they’re sufficient to handle existing query loads.
- Identify inefficiencies. Pinpoint costly and inefficient queries for optimization.
- Address user skill gaps. Assess whether users need training in selecting the appropriate warehouse for their queries, or SQL training to ensure queries are written as efficiently as possible.
- Manage your capacity proactively. Continuously monitor workload changes and trends to ensure you have sufficient capacity to handle future processing requirements.
You can download Snowflake Workload Intelligence from the Snowflake Marketplace and run it natively: no plugins, setup, or data exchange. Plus, it’s completely secure as your data never leaves your account.
What does Snowflake performance and cost tuning entail?
Once you have a reliable way to monitor Snowflake performance, it’s time to start taking action to reduce costs. In general, Snowflake performance tuning tactics fall into three broad categories:
- Data loading
- Data transformation
- Query & task execution times
Data loading
Because Snowflake supports external stages—e.g. cloud storage in AWS S3—you’ll often need to load raw data from those stages into Snowflake. This can be done either in batches or continuously via Snowpipe. In both cases, data loading uses compute resources, which means you have to provide a warehouse to handle it.
Some common tactics to tune data loading performance include:
- Using separate warehouses / compute resources for different use cases—data loading, data transformation, SQL queries, etc.—to mitigate warehouse overhead
- Adjusting warehouse sizing to accommodate bulk loads, then readjusting once those loads are complete
- Decreasing auto-suspend time to reduce warehouse run time after queries have stopped (see the sketch after the loading examples below)
- Partitioning staged data files to avoid scanning terabytes that have already been loaded
In the last case, here are three distinct ways to scan data files when loading them into Snowflake:
-- Simple method: Scan entire stage
copy into sales_table
from @landing_data
pattern='.*[.]csv';
-- More flexible method: Limit within a directory path
copy into sales_table
from @landing_data/sales/transactions/2024/06
pattern='.*[.]csv';
-- Fastest method: A named file
copy into sales_table
from @landing_data/sales/transactions/2024/06/sales_050.csv;
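For the auto-suspend tactic mentioned in the list above, you can lower the suspension window on a dedicated loading warehouse so it stops billing shortly after a load finishes. A minimal sketch, assuming a loading warehouse named load_wh (the name and the 60-second value are illustrative):
-- Suspend the loading warehouse 60 seconds after it goes idle
ALTER WAREHOUSE load_wh SET AUTO_SUSPEND = 60;
-- Resume automatically when the next load arrives
ALTER WAREHOUSE load_wh SET AUTO_RESUME = TRUE;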
Data transformation
Data transformation in Snowflake involves a range of activities, including data conversion—which includes extraction and scrubbing—as well as cleansing and aggregation. Data transformation typically occurs when:
- Data is moved
- Various data types need to be analyzed alongside each other
- Information is added to existing data sets
- Users wish to aggregate data from multiple data sets
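As a simple illustration, here's a minimal sketch of a transformation that scrubs and aggregates raw data into a reporting table (raw_orders and daily_order_totals are hypothetical names, and order_ts is assumed to be a timestamp column):
-- Cleanse and aggregate raw data into a daily summary table
CREATE OR REPLACE TABLE daily_order_totals AS
SELECT TO_DATE(order_ts) AS order_date,
       UPPER(TRIM(region)) AS region,   -- basic scrubbing
       SUM(order_amount) AS total_amount
FROM raw_orders
WHERE order_amount IS NOT NULL          -- drop incomplete rows
GROUP BY TO_DATE(order_ts), UPPER(TRIM(region));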
Query & task execution times
The final Snowflake performance tuning tactic we’ll consider is reducing execution times for both queries and tasks. There are a number of ways to surface a list of the longest-running queries. You can write one manually in SQL:
SELECT query_id,
query_text,
total_elapsed_time / 1000 AS total_elapsed_time_seconds,
execution_time / 1000 AS query_execution_time_seconds,
partitions_scanned,
partitions_total
FROM snowflake.account_usage.query_history Q
WHERE TO_DATE(Q.start_time) > DATEADD(day,-1,TO_DATE(CURRENT_TIMESTAMP()))
AND warehouse_name = 'my_warehouse'
AND warehouse_size IS NOT NULL
AND error_code IS NULL
AND partitions_scanned IS NOT NULL
ORDER BY total_elapsed_time DESC
LIMIT n;
Alternatively, you can use Snowflake Workload Intelligence to surface these long-running queries and optimize them in real time.
Top 7 Snowflake performance tuning tactics
Now that we’ve looked at Snowflake performance tuning at a high level, let’s dive into seven of the most popular tactics.
1. Scaling warehouses up/down
The most straightforward Snowflake performance tuning tactic is to vertically scale warehouses to provide necessary compute resources. This has the dual benefit of reducing individual query execution time, as well as execution time for the entire workload.
This typically happens when:
- Queries are spilling to remote disk
- User demands require faster query results
To find the queries spilling the most data to remote disk, you can run the following query:
SELECT query_id,
query_text,
user_name,
warehouse_name,
bytes_spilled_to_local_storage,
bytes_spilled_to_remote_storage,
bytes_spilled_to_local_storage +
bytes_spilled_to_remote_storage AS
total_bytes_spilled
FROM snowflake.account_usage.query_history
WHERE (bytes_spilled_to_local_storage > 0
OR bytes_spilled_to_remote_storage > 0 )
AND TO_DATE(start_time) > DATEADD(day, -1,
TO_DATE(CURRENT_TIMESTAMP()))
ORDER BY total_bytes_spilled DESC
LIMIT 10;
To identify the longest-running queries (here the top ten), you can use the same query referenced in the “Query & task execution times” section above (or use Snowflake Workload Intelligence to identify these heavy hitters):
Query execution times in Snowflake Workload Intelligence
Once you identify heavy hitter queries, you can scale up those warehouses to accommodate the demand. However, this is only step one. The next step is to scale those warehouses back down after the fact so you’re not spending unnecessarily on larger warehouses.
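A minimal sketch of that scale-up/scale-down cycle (the warehouse name and sizes are illustrative):
-- Scale up before the heavy workload runs
ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'LARGE';
-- ...run the heavy-hitter queries...
-- Scale back down once the workload completes
ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'SMALL';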
Yes, this is a highly involved and, frankly, inefficient process. But it’s probably one of the best opportunities to tune Snowflake performance and start optimizing costs. If you’d rather deploy your engineering resources elsewhere, consider using an AI-powered tool like Keebo to make these adjustments in real time.
2. Check and reduce queuing
Another common Snowflake performance tuning tactic is to check and reduce queuing. When a warehouse is busy executing queries, Snowflake cannot start executing new queries until adequate resources become available. This typically happens for one of two reasons: either the warehouse is overloaded, or its configuration doesn’t allow it to spin up additional clusters.
To check your queuing in your Snowflake instance, run this simple SQL query:
SELECT warehouse_name,
SUM(queued_provisioning_time + queued_repair_time +
queued_overload_time) /
1000 AS total_queued_time_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
AND (queued_provisioning_time > 0 OR queued_repair_time > 0 OR
queued_overload_time > 0 OR transaction_blocked_time > 0)
GROUP BY warehouse_name
ORDER BY total_queued_time_seconds DESC;
Keep in mind that for runs of tasks, the USER_TASK_TIMEOUT_MS task parameter takes precedence over the STATEMENT_QUEUED_TIMEOUT_IN_SECONDS setting. The former uses units of milliseconds, while the latter uses units of seconds.
To identify individual queries that experienced queuing delays, you can leverage the following Snowflake query:
SELECT query_id,
user_name,
warehouse_name,
start_time,
end_time,
total_elapsed_time / 1000 AS total_elapsed_time_seconds,
queued_provisioning_time / 1000 AS queued_provisioning_time_seconds,
queued_repair_time / 1000 AS queued_repair_time_seconds,
queued_overload_time / 1000 AS queued_overload_time_seconds,
(queued_provisioning_time + queued_repair_time + queued_overload_time) /
1000 AS total_queued_time_seconds,
transaction_blocked_time / 1000 AS transaction_blocked_time_seconds,
execution_time / 1000 AS execution_time_seconds,
query_text
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
AND (queued_provisioning_time > 0 OR queued_repair_time > 0 OR
queued_overload_time > 0 OR transaction_blocked_time > 0)
ORDER BY total_queued_time_seconds DESC;
Once you’ve checked queuing, you have several options to mitigate this problem:
- For a regular warehouse (i.e. not a multi-cluster warehouse), consider creating additional warehouses, and route the queries among them. If specific queries are causing usage spikes, focus on moving those queries or increasing warehouse size.
- For a multi-cluster warehouse, increase the maximum number of clusters (see the sketch after this list).
- Adjust the maximum concurrency level, which controls the number of queries that can run concurrently in a single warehouse.
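For the multi-cluster option, a minimal sketch (assuming a multi-cluster warehouse named my_wh on an edition that supports multi-cluster warehouses):
-- Let the warehouse scale out to more clusters when queries start queuing
ALTER WAREHOUSE my_wh SET MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 4;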
To set the MAX_CONCURRENCY_LEVEL for a warehouse, run this (in this example, we’re setting it at three concurrent queries):
ALTER WAREHOUSE my_wh SET MAX_CONCURRENCY_LEVEL = 3;
A few important things to keep in mind when engaging in concurrency optimization:
- Decreasing concurrency improves performance by allocating more resources per query
- Increasing concurrency can reduce performance if the system becomes overloaded with requests
- It’s important to optimize concurrency based on workload and available resources—there are no real one-size-fits-all recommendations
- You’ll need to monitor metrics like execution times, spillage, and the number of running and queued queries to identify the optimal concurrency setting
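To monitor the last of those metrics, running versus queued queries per warehouse, one option is a sketch against the account usage WAREHOUSE_LOAD_HISTORY view (the one-day window is illustrative):
-- Average running vs. queued query load per warehouse over the last day
SELECT warehouse_name,
       AVG(avg_running) AS avg_running_queries,
       AVG(avg_queued_load) AS avg_queued_queries
FROM snowflake.account_usage.warehouse_load_history
WHERE start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY avg_queued_queries DESC;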
3. Improve data read & processing efficiency
One consistent drag on query performance is time spent reading data from table storage. This step involves downloading data over the network from the table’s storage location and pulling it into the virtual warehouse’s worker nodes.
To speed up this process, you can either reduce the volume of data downloaded, or scale up the virtual warehouse. We’ve already covered the latter, so let’s look at some ways to reduce data downloaded by queries:
- Reducing the number of columns accessed
- Leveraging query pruning (more on that below) and table clustering
- Using clustered columns in join predicates and pre-aggregated tables
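For example, a minimal sketch of the first two points: select only the columns you need and filter on a clustering key so Snowflake can prune micro-partitions (the table, columns, and date values are illustrative):
-- Prune columns and partitions instead of running SELECT * over the whole table
SELECT order_id, order_date, total_amount
FROM sales_table
WHERE order_date >= '2024-06-01'   -- filtering on a clustering key enables pruning
  AND order_date < '2024-07-01';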
Further, operations like joins, sorts, and aggregates, which happen downstream of reading data from storage, can also slow query execution. Some steps to improve these functions include:
- Simplifying queries
- Reducing query operations
- Reducing data volume by filtering earlier in the workflow
- Avoiding repeated references to CTEs
- Removing unnecessary sorts
- Avoiding joins with an OR condition
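As an example of the last point, an OR in a join condition often prevents Snowflake from using an efficient join strategy; one common rewrite (a sketch with hypothetical table and column names) splits it into two joins combined with UNION:
-- Instead of: JOIN table_b b ON a.key1 = b.key1 OR a.key2 = b.key2
SELECT a.id, b.value
FROM table_a a
JOIN table_b b ON a.key1 = b.key1
UNION   -- UNION also removes fully duplicate rows; confirm that matches your data
SELECT a.id, b.value
FROM table_a a
JOIN table_b b ON a.key2 = b.key2;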
4. Use result caching to reduce query latency
Another tactic in Snowflake performance tuning is to optimize query caching. The uppermost, and perhaps most vital, layer is the result cache (the other two being local disk and remote disk cache).
When Snowflake executes a query, the full results are stored in the result cache for 24 hours (this period is not adjustable). This enables subsequent queries to retrieve these cached results directly, without running the full query again. Naturally, this significantly reduces query latency and consumes fewer compute resources.
Result caching makes use of two central functions. The first is RESULT_SCAN, which returns the result set of a previously executed command in table format—although rows aren’t guaranteed to be returned in the same order as the original query. RESULT_SCAN is particularly useful for analyzing or further processing query results without having to re-run potentially time-consuming queries.
This enables you to process output from the following:
- SHOW or DESC[RIBE] command that you executed
- Queries executed on metadata or account usage information
- Stored procedure results
SELECT * FROM TABLE (RESULT_SCAN('<query_id>'));
Another option is to use DESCRIBE RESULT (abbreviated DESC RESULT), which describes the columns in the result of a query from your current session or a past session within the last 24 hours:
DESC[RIBE] RESULT { '<query_id>' | LAST_QUERY_ID() }
As you can see above, both DESC RESULT and RESULT_SCAN can take LAST_QUERY_ID() in place of an explicit query ID.
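For instance, a minimal sketch that runs a SHOW command and then queries its output like a table (my_db.public is a hypothetical schema):
-- Run a SHOW command, then scan its result set with RESULT_SCAN
SHOW TABLES IN SCHEMA my_db.public;
SELECT "name", "rows"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
ORDER BY "rows" DESC;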
5. Mitigating row explosion
Row explosion occurs when a query produces an unexpectedly large number of result rows, typically because of joins or other operations that multiply rows. The result is degraded performance and excessive resource consumption. Some steps to address row explosion include:
- Optimizing data types and volume. Consider using more compact data types and filtering unnecessary columns or rows to improve query performance. For example:
SELECT column1, CAST(column2 AS VARCHAR(10))
FROM your_table
WHERE column3 = 'some-value';
- Optimize subqueries. Consider rewriting subqueries as joins or using CTEs (Common Table Expressions). For example:
WITH subquery AS (
SELECT column1
FROM table1
WHERE column2 = 'value'
)
SELECT subquery.column1, COUNT(*)
FROM subquery
JOIN table2 ON subquery.column1 = table2.column1
GROUP BY subquery.column1;
- Reduce returned rows using DISTINCT or GROUP BY clauses. These clauses are helpful when you don’t need duplicate rows (e.g. when you only need unique values). For example:
SELECT DISTINCT col1, col2
FROM large_fact_table;
- Avoid Cartesian products. When a join condition is not specified or incorrect, Snowflake ends up using Cartesian products, which combine every row from both tables. To avoid this, make sure join conditions are properly defined. For example:
SELECT *
FROM table1
JOIN table2 ON table1.common_column = table2.common_column;
- Use appropriate aggregations. Instead of joining and then aggregating, aggregate before joining. For example:
SELECT t1.id,
t1.name,
agg.total
FROM table1 t1 JOIN (
SELECT table1_id,
SUM(amount) AS total
FROM table2
GROUP BY table1_id ) agg
ON t1.id = agg.table1_id;
- Use temporary tables. When you have complex queries with multiple joins or subqueries, you can use temporary tables to break down the query into smaller steps and store immediate results in temporary tables. For example:
-- Create temporary table
CREATE TEMPORARY TABLE temp_table AS
SELECT column1, column2
FROM table1
JOIN table2 ON table1.id = table2.id;
-- Use the temporary table in subsequent operations
SELECT column1, COUNT(column2)
FROM temp_table
GROUP BY column1;
6. Query pruning
Query pruning reduces the amount of data scanned during execution: Snowflake uses metadata about each table’s micro-partitions to determine which partitions are relevant to a query and which can be skipped. Snowflake also uses data clustering to group similar data together, which makes pruning more efficient and further reduces the data that needs to be scanned.
Data clustering can occur during or after table creation. In the former case, you can define clustering keys by appending a CLUSTER BY clause to CREATE TABLE:
CREATE TABLE <name> ... CLUSTER BY ( <expr1> [ , <expr2> ... ] )
Where each clustering key consists of one or more table columns/expressions, which can be of any data type, except GEOGRAPHY, VARIANT, OBJECT, or ARRAY. A clustering key can contain any of the following:
- Base columns
- Expressions on base columns
- Expressions on paths in VARIANT columns
For example:
-- cluster by base columns
CREATE OR REPLACE TABLE t1 (c1 DATE, c2 STRING, c3 NUMBER) CLUSTER BY (c1, c2);
-- cluster by expressions
CREATE OR REPLACE TABLE t2 (c1 timestamp, c2 STRING, c3 NUMBER) CLUSTER BY (TO_DATE(C1), substring(c2, 0, 10));
-- cluster by paths in variant columns
CREATE OR REPLACE TABLE T3 (t timestamp, v variant) cluster by (v:"Data":id::number);
At any time, you can add a clustering key to an existing table or change the existing clustering key for a table using ALTER TABLE:
ALTER TABLE <name> CLUSTER BY ( <expr1> [ , <expr2> ... ] )
For example:
-- cluster by base columns
ALTER TABLE t1 CLUSTER BY (c1, c3);
-- cluster by expressions
ALTER TABLE T2 CLUSTER BY (SUBSTRING(C2, 5, 15), TO_DATE(C1));
-- cluster by paths in variant columns
ALTER TABLE T3 CLUSTER BY (v:"Data":name::string, v:"Data":id::number);
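To check how well a table is clustered on a given key, you can call the SYSTEM$CLUSTERING_INFORMATION function (shown here against the t1 example above):
-- Returns clustering depth and overlap statistics as a JSON string
SELECT SYSTEM$CLUSTERING_INFORMATION('t1', '(c1, c3)');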
7. Monitoring spillage
A critical component of Snowflake performance tuning is monitoring spillage. This occurs when a warehouse can’t accommodate an operation in memory and “spills” the data onto the local disk of a warehouse node, or to remote storage. Spillage significantly degrades individual query performance and can drag down the overall warehouse as well.
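To keep an eye on spillage at the warehouse level rather than per query, one option is to aggregate the same spill columns used earlier (a sketch with a one-day window for illustration):
-- Total local and remote spillage per warehouse over the last day
SELECT warehouse_name,
       SUM(bytes_spilled_to_local_storage) AS local_spill_bytes,
       SUM(bytes_spilled_to_remote_storage) AS remote_spill_bytes
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY remote_spill_bytes DESC;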
Solutions to spillage include several of the solutions mentioned above, including:
- Scaling warehouses up (then back down when the excessive demand has ended)
- Query pruning
- Mitigating row explosion
- Query optimization
- Reducing concurrency levels
- Leveraging temporary tables
So what do most data teams get wrong about Snowflake performance tuning?
While, yes, we wanted the information above to be helpful, we had a bit of an ulterior motive. Part of the reason we went into such detail was to show you just how involved Snowflake performance tuning can be.
Not only does it require significant manual labor to implement, but this isn’t something you can hand off to an intern or entry-level engineer. This brings us back to the problem mentioned in the introduction: in your effort to reduce costs through Snowflake performance tuning, you can end up consuming so many engineering resources that the savings cancel out.
What’s more, warehouse optimization is an ongoing process. Continuously evolving workloads, new users, and applications that introduce diverse queries require constant attention and optimization to keep costs low without sacrificing performance.
So what’s the solution to this rock-and-a-hard-place dilemma? It’s simple: don’t rely on humans to implement these optimizations. Instead, let a robot handle it.
Keebo leverages advanced AI algorithms to constantly monitor dozens of parameters, adjusting them in real time to avoid over- or under-provisioning. In a dynamic cloud environment, Keebo works 24/7, making changes that human data teams will simply miss.
Want to learn more about how Keebo can automate Snowflake performance tuning? Schedule a demo of the product here.