Reputation: 121
I have the following requirement.
We have 1400 sharded tables in BQ based on specific functional grouping. Maximum of these individual sharded tables are also time-partitioned for DAY.
Smaller tables are not time-partitioned.
I am trying to create views dynamically in BigQuery by doing a UNION of all the sharded tables and writing a WHERE filter condition on _partitiontime.
But the problem is there could be smaller tables which are not time-partitioned, the query fails.
There are many more tables created on daily basis and I cant settle for a static exclusion solution since that would require manually maintaining a file with table names etc. (Apache Beam + BQ uses the data that is coming in and creates new sharded tables for newer functional groups without manual intervention.)
I am trying to exclude the non time-partitioned tables by using bq
utility to connect to BQ more than 1000 times to check if a table is time-partitioned.
bq show --format=prettyjson
and check the timepartitioning
type
field.
This is very slow, more than 30 minutes.
I have tried the __TABLES_SUMMARY__
, but it does not have the partitioning info.
I have also checked
SELECT partition_id from [mydataset.table1$__PARTITIONS_SUMMARY__];
, but this works if the table is already partitioned.
Upvotes: 12
Views: 12443
Reputation: 151
You can write the below query in standard SQL in BQ to get the information
SELECT *
FROM `project_id.dataset_name.INFORMATION_SCHEMA.COLUMNS`
WHERE TABLE_NAME = Table_Name
AND is_partitioning_column = "YES"
Upvotes: 15
Reputation: 1
You can write a function that runs a query like this with try and except, the query will error if the table is not partitioned.
SELECT partition_id FROM [%s.%s$__PARTITIONS_SUMMARY__]' % (dataset_id, table_id)
Upvotes: 0
Reputation: 1122
Unfortunately, there is no way to determine this dynamically.
As a workaround, I would suggest keeping your partitioned tables and your non-partitioned tables in distinct datasets OR including information about whether the table is partitioned in its name.
You could also maintain a list of table prefixes that are known to be partitioned (or non-partitioned, if that is easier to maintain).
You can also open a feature request on the public issue tracker for BigQuery: https://issuetracker.google.com/issues/new?component=187149&template=0
Upvotes: 1