David542
David542

Reputation: 110093

Partitioned tables in BigQuery

I was wondering what the usage of using a partitioned table in BigQuery is. It seems most of the queries seem to take about the same time to finish regardless of size (ignoring extremes, I'm generalizing), is this mainly a matter of using it to reduce costs on the bytes processed, or what is the main use case of partitioning tables in BQ?

https://cloud.google.com/bigquery/docs/creating-column-partitions

Upvotes: 0

Views: 671

Answers (1)

Pentium10
Pentium10

Reputation: 207830

There are multiple benefits, mainly costs.

  • by writing a query to read only eg: 7 days of partitions instead of 7 years you have lower costs
  • partitions you don't touch for older than 90 days are at lower costs
  • you can clearly reload a day's data much more easier than having to work around
  • you are still recommended to use YEARly tables eg mytable_2018, but you are no longer required to have daily tables eg: mytable_20180101, this further leads to have simpler queries, also no longer a problem to read more than 1000 tables (which is a hard limit).
  • when you modify schema, you need to modify a few tables, you no longer need to script alters on thousands of table
  • this also means it's lover bytes processed and in the cloud platform can be better optimized and needs fewer resources
  • by reorganizing data into partitioned tables the query times will benefit in the future. As customers will move data, the cloud engineering team will optimize the service for better usage.
  • you see clear cost wise benefits if your existing data is at least a couple of terabytes.

Upvotes: 3

Related Questions