Reputation: 148
I wanted to create a external table in bigquery which loads data from google cloud storage. During creation of table from Web UI the option of Partitioning Type gets disabled. Is there any way i can create partitioned external table ? My data is already partitioned by date format on GCS Ex: /somepath/data/dt=2018-03-22
Upvotes: 5
Views: 6685
Reputation: 59165
Federated tables in GCS automatically act as partitioned tables when you use the file name as a variable (_FILE_NAME
).
For example, this view transforms the file name into a native date:
#standardSQL
CREATE VIEW `fh-bigquery.views.wikipedia_views_test_ddl`
AS SELECT
PARSE_TIMESTAMP('%Y%m%d-%H%M%S', REGEXP_EXTRACT(_FILE_NAME, '[0-9]+-[0-9]+')) datehour
, _FILE_NAME filename
, line
FROM `fh-bigquery.views.wikipedia_views_gcs`
Later I can write queries like:
#standardSQL
SELECT *
FROM `fh-bigquery.views.wikipedia_views_test_ddl`
WHERE EXTRACT(YEAR FROM datehour)=2015
AND EXTRACT(MONTH FROM datehour)=10
AND EXTRACT(DAY FROM datehour)=21
AND EXTRACT(HOUR FROM datehour)=7
... and these queries will only open the files with names that match this pattern.
I wrote a whole story about this at https://medium.com/google-cloud/bigquery-lazy-data-loading-ddl-dml-partitions-and-half-a-trillion-wikipedia-pageviews-cd3eacd657b6.
Upvotes: 10