Amboj
Amboj

Reputation: 148

Creating partitioned external table in bigquery

I wanted to create a external table in bigquery which loads data from google cloud storage. During creation of table from Web UI the option of Partitioning Type gets disabled. Is there any way i can create partitioned external table ? My data is already partitioned by date format on GCS Ex: /somepath/data/dt=2018-03-22

Upvotes: 5

Views: 6685

Answers (1)

Felipe Hoffa
Felipe Hoffa

Reputation: 59165

Federated tables in GCS automatically act as partitioned tables when you use the file name as a variable (_FILE_NAME).

For example, this view transforms the file name into a native date:

#standardSQL
CREATE VIEW `fh-bigquery.views.wikipedia_views_test_ddl`
AS SELECT
  PARSE_TIMESTAMP('%Y%m%d-%H%M%S', REGEXP_EXTRACT(_FILE_NAME, '[0-9]+-[0-9]+')) datehour
  , _FILE_NAME filename
  , line
FROM `fh-bigquery.views.wikipedia_views_gcs`

Later I can write queries like:

#standardSQL
SELECT *
FROM `fh-bigquery.views.wikipedia_views_test_ddl`
WHERE EXTRACT(YEAR FROM datehour)=2015
AND EXTRACT(MONTH FROM datehour)=10
AND EXTRACT(DAY FROM datehour)=21
AND EXTRACT(HOUR FROM datehour)=7

... and these queries will only open the files with names that match this pattern.

I wrote a whole story about this at https://medium.com/google-cloud/bigquery-lazy-data-loading-ddl-dml-partitions-and-half-a-trillion-wikipedia-pageviews-cd3eacd657b6.

Upvotes: 10

Related Questions