asd

Reputation: 183

Search optimization vs. cluster key in Snowflake

Can someone explain when to use search optimization and when to use a cluster key for a table, or whether we should use both? I see that we lose credits if we enable both of them.

Thanks, Sye

Upvotes: 1

Views: 679

Answers (3)

peterb

Reputation: 727

When you load a table into Snowflake, it creates 'micropartitions' based on the order of the rows at load time. When a SQL statement is run, the where clause is used to prune the search space, that is, to work out which micropartitions need to be scanned and which can be skipped.

A cluster key in Snowflake simply reorders the data by the cluster key, so that rows with similar key values are co-located within the same micropartitions. This can result in massive performance improvements if your queries frequently use the cluster key in the where clause to filter the results.
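As a rough sketch of what this looks like in practice, assuming a hypothetical table named sales with a transaction_date column (both names are illustrative and not taken from the question), a cluster key can be defined and then inspected like this:

```sql
-- Hypothetical table and column names, used only for illustration.
-- Define a cluster key; Automatic Clustering then reorganizes
-- micropartitions in the background.
ALTER TABLE sales CLUSTER BY (transaction_date);

-- Inspect how well the table is clustered on that column.
-- The function returns a JSON summary (overlaps, depth, and so on).
SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(transaction_date)');
```

The background reclustering consumes credits, so it is worth comparing the clustering information before and after to see whether the key is paying off.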

Search optimization is for finding one or a small number of records using '=' in the where clause.

So if you have a table with Product_ID, Transaction_Date, and Amount, queries using 'Where Year(Transaction_Date) >= 2017' would benefit from a cluster key on Transaction_Date.

Queries using 'Where Product_ID = 111222333' would benefit from search optimization.
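A minimal sketch of the search optimization side, again assuming a hypothetical table named sales with the three columns mentioned above (note that search optimization needs Enterprise Edition or higher):

```sql
-- Hypothetical table `sales`; the name is an assumption for illustration.
-- Enable search optimization only for equality lookups on one column,
-- which keeps the maintenance cost lower than enabling it table-wide.
ALTER TABLE sales ADD SEARCH OPTIMIZATION ON EQUALITY(product_id);

-- A point lookup of the kind this feature targets.
SELECT product_id, transaction_date, amount
FROM sales
WHERE product_id = 111222333;
```

Restricting the optimization to the columns you actually look up by is usually the cheaper choice, since the search access path has to be built and maintained for every column it covers.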

In either case, these are only needed if your table is large (think billions of rows). Otherwise, the native Snowflake micropartition approach will do a good job at optimization.

Upvotes: 1

Jeffrey Jacobs

Reputation: 332

Please don't call Cluster Key "partitioning". Although the effect is similar, they are two distinct operations with different meanings. I will be publishing an article on partitioning and pruning shortly.

Upvotes: 0

Gokhan Atil

Reputation: 10059

Search optimization is used when you need to access a small number of rows (point lookup queries), as you would when querying an OLTP database.

A cluster key is for partitioning your data. It's generally good for any kind of workload unless you need to read the whole table.

If you don't need to access specific rows in your large table, you don't need the search optimization service.

If your table is not large, or if you ingest "ordered" data into your table, you don't need auto-clustering (cluster keys).
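On the credits concern from the question: both features do background maintenance that is billed separately, so it can help to measure before deciding. The following is only a sketch, assuming a hypothetical table named sales in the current database and schema:

```sql
-- Hypothetical table name `sales`; adjust to your own database and schema.
-- Estimate what search optimization would cost before enabling it.
SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('sales');

-- Credits used by Automatic Clustering on the table over the last 7 days.
SELECT *
FROM TABLE(INFORMATION_SCHEMA.AUTOMATIC_CLUSTERING_HISTORY(
    DATE_RANGE_START => DATEADD('day', -7, CURRENT_DATE()),
    TABLE_NAME => 'sales'));

-- Turn either feature off if it does not pay for itself.
ALTER TABLE sales SUSPEND RECLUSTER;
ALTER TABLE sales DROP SEARCH OPTIMIZATION;
```

If the history shows steady credit use but your queries are not pruning noticeably better, that is a reasonable sign the feature is not worth keeping on for that table.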

Upvotes: 1
