Raghav L
Raghav L

Reputation: 11

Is partition column in where clause not mandatory in case we are filtering using another column which is of same partition type(day) in BigQuery?

I have a table partitioned based on month(Month start date) column in BigQuery, I can observe that when I am trying to filter records based on another column last_updated(date) column and not specifying the column month on which the partition happened still BigQuery is fetching only records for the partitions where the last_updated range is coming under

  1. Is it not mandatory to use partitioned column in case we are filtering based on similar column?

Schema : name, status, last_updated, month, quarter, year. Table is partitioned on month column and contains 11 records, 5 for June. I can see in execution details input rows selected as 5 and output as 1

SELECT 
*
FROM 
   <dataset>.test_part 
WHERE  last_updated = '2020-06-16

Even though month column is not explicitly mentioned the partition is being selected

Upvotes: 1

Views: 1270

Answers (1)

When you create a partitioned table, it is divided in different segments, called partitions that make you easier to manage your data, because when you query the data using the partitioned column you only need to check that partition and no other. However, if you use another column in the where clause there is no guarantee that dremel will process more data than it would using the partitioned column in the where clause.

You can refer to this document to better understand how it works [1].

[1] https://cloud.google.com/bigquery/docs/partitioned-tables

Upvotes: 0

Related Questions