micropartition statistics snowflake

Question

It's well known that micro-partitions in Snowflake's architecture help the query optimizer return results faster when a query asks, per column, for:

a specific range of data
statistics (MAX, MIN, COUNT)

However, according to the documentation, micro-partitions also keep information about the number of distinct values. I have been trying to test when the query optimizer can step in under the hood and answer a query without starting a compute layer, so that results come back quickly and without any compute work.

I tried MAX, MIN, and COUNT, and for these the results were returned without any compute layer and in a very decent time. However, when I ran a COUNT DISTINCT, I noticed that the compute layer was started before the results showed up.

So, do micro-partitions benefit the query optimizer by keeping summarized data available, while queries that need distinct counts or AVG still require a compute operation?
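To make the question concrete, here is a toy Python sketch of my current understanding. The `PartitionStats` class, its fields, and the sample values are all hypothetical; this is not how Snowflake actually stores or uses its metadata. It only shows that per-partition MIN, MAX, and COUNT can be merged directly, while per-partition distinct counts only give bounds on the overall distinct count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionStats:
    """Toy per-partition column metadata (hypothetical, not Snowflake's real format)."""
    min_value: int
    max_value: int
    row_count: int
    distinct_count: int


def summarize(values):
    """Build the toy metadata for one non-empty partition's column values."""
    return PartitionStats(min(values), max(values), len(values), len(set(values)))


def combine(stats):
    """Merge per-partition metadata without looking at the rows themselves."""
    return {
        "min": min(s.min_value for s in stats),
        "max": max(s.max_value for s in stats),
        "count": sum(s.row_count for s in stats),
        # The same value can appear in several partitions, so per-partition
        # distinct counts cannot simply be added; they only bound the answer.
        "distinct_lower_bound": max(s.distinct_count for s in stats),
        "distinct_upper_bound": sum(s.distinct_count for s in stats),
    }


# Three made-up partitions of one integer column.
partitions = [[1, 2, 2, 3], [3, 4, 4, 5], [5, 5, 6, 1]]
stats = [summarize(p) for p in partitions]
result = combine(stats)

# The exact distinct count needs a pass over the actual values.
exact_distinct = len({v for p in partitions for v in p})
```

In this toy model the exact distinct count falls strictly between the two bounds, which is what makes me suspect that a COUNT DISTINCT always needs a scan. AVG would similarly need a per-partition sum, which this toy metadata does not keep.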

Thanks.

Answers (1)
