Reputation: 1
Reference:
Has anyone used Snowflake search optimization and gained benefits over cluster keys? Please share any use cases, cost vs performance as well.
Appreciate the insights
Upvotes: 0
Views: 164
Reputation: 56
Snowflake Search Optimisation is designed to find the “Needle in the haystack” whereas Data Clustering is designed to reduce the data returned, not find a tiny number of rows.
I have (at least) two use cases where Search Optimisation service and Clustering worked.
I had a 1.3 TB table with billions of rows and 80% of queries fetched data for the current month with 50% during the current week. Of he remaining 20%, a majority fetched data for a month or so, but up to 15 years ago.
A cluster key kept the data physically sorted by transaction date and queries ran in seconds for 90% of cases. However, clustering only worked well because most queries included a filter like:
where TRANSACTION_DATE between :START and :END;
Typical queries returned thousands or even hundreds of thousands of rows from billions.
Another table (again TB sized) had queries using EQUALITY searches against a combination of keys. Unlike the previous use-case, analysts searched for rows across the entire history (all 20 years of billions of rows), and a typical WHERE clause looked like:
where AGE_GROUP = ‘GEN-Z’
and INCOME_BRACKET = :INCOME
and REGION = :REGION
and GENDER = ‘MALE’
and M_STATUS in (‘MARRIED’, ‘COHABITING’)
The key points were:
Use case 1 (clustering) could return 100,000 rows, but because the data was both queried and (more importantly) stored by DATE, most queries fetched about 10 micro partitions. All these were “clustered together”.
Use case 2 Search Optimisation service, typically returned < 1,000 rows (from 70,000 micro partitions), but these were from across the entire table history.
The more selective the query, the faster the response.
I’ve written a blog post about my experience: https://www.analytics.today/blog/snowflake-search-optimization-service-best-practices
Hope this helps.
Upvotes: 0
Reputation: 466
In general, Search Optimisation Service (SOS) would be more beneficial over Clustering for point lookup queries, the type of queries that retrieves 1 or a few rows from a very large table using equality or IN filter condition.
Since you can only have one cluster key in a table, SOS can also help optimise searches from non-cluster-key columns in a clustered table.
However unlike Clustering, SOS adds storage cost which holds search access path data for each table with SOS enabled
Upvotes: 1