Reputation: 361
I have a Google Cloud bucket with multiple folders, each containing CSVs.
I want to create eventually one BigQuery table, containing all the CSVs, but add a column stating the original folder. I thought to first create a BigQuery table for each such folder and then union them.
Is there a quick way to create multiple BigQuery tables, one for each subfolder under a Google Cloud bucket? Plus - if there's an easy way to automatically add the folder name as a new column - it could be great!
Upvotes: 2
Views: 2350
Reputation: 59175
I wrote a blog post about this:
The trick is to create a federated query in BigQuery pointing to a glob of your files (gs://bucket/folder/*
), which gives you the pseudo column _FILE_NAME
:
SELECT *, _FILE_NAME fn
FROM `fh-bigquery.views.wikipedia_views_gcs`
LIMIT 10
Ta-da!
Upvotes: 2