Reputation: 11
I have a table partitioned based on month(Month start date) column in BigQuery, I can observe that when I am trying to filter records based on another column last_updated(date) column and not specifying the column month on which the partition happened still BigQuery is fetching only records for the partitions where the last_updated range is coming under
Schema : name, status, last_updated, month, quarter, year. Table is partitioned on month column and contains 11 records, 5 for June. I can see in execution details input rows selected as 5 and output as 1
SELECT
*
FROM
<dataset>.test_part
WHERE last_updated = '2020-06-16
Even though month column is not explicitly mentioned the partition is being selected
Upvotes: 1
Views: 1270
Reputation: 416
When you create a partitioned table, it is divided in different segments, called partitions that make you easier to manage your data, because when you query the data using the partitioned column you only need to check that partition and no other. However, if you use another column in the where clause there is no guarantee that dremel will process more data than it would using the partitioned column in the where clause.
You can refer to this document to better understand how it works [1].
[1] https://cloud.google.com/bigquery/docs/partitioned-tables
Upvotes: 0