AIK DO

Reputation: 368

BigQuery to GCS JSON

I wanted to be able to store BigQuery results as JSON files in Google Cloud Storage. I could not find an out-of-the-box way of doing this, so what I had to do was (a code sketch of these steps follows the list):

  1. Run the query against BigQuery and store the results in a permanent table. I use a random GUID to name the permanent table.
  2. Read the data from BigQuery, convert it to JSON in my server-side code, and upload the JSON data to GCS.
  3. Delete the permanent table.
  4. Return the GCS URL of the JSON file to the front-end application.
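
A rough sketch of these steps using the Python client libraries (the project, dataset, and bucket names are placeholders for what my real code uses):

```python
import json
import uuid

from google.cloud import bigquery, storage

bq = bigquery.Client()
gcs = storage.Client()

# 1. Run the query into a permanent table named with a random GUID.
table_id = f"my-project.my_dataset.results_{uuid.uuid4().hex}"
job_config = bigquery.QueryJobConfig(destination=table_id)
bq.query("SELECT name, total FROM `my-project.my_dataset.source`",
         job_config=job_config).result()

# 2. Read the rows server-side, convert them to JSON, upload to GCS.
rows = [dict(row) for row in bq.list_rows(table_id)]
blob = gcs.bucket("my-bucket").blob(f"exports/{uuid.uuid4().hex}.json")
blob.upload_from_string(json.dumps(rows, default=str),
                        content_type="application/json")

# 3. Delete the permanent table.
bq.delete_table(table_id)

# 4. Return the GCS URL of the JSON file to the front end.
json_url = f"gs://my-bucket/{blob.name}"
```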

While this works, there are some issues with it.

A. I do not believe I am benefiting from BigQuery's caching, since I am using my own permanent tables. Can someone confirm this?

B. Step 2 will be a performance bottleneck. Pulling data out of GCP just to convert it to JSON and re-upload it into GCP feels wrong. A better approach would be a serverless function, or some other GCP data-workflow service, that performs this step and is triggered by the creation of a new table in the dataset. What do you think is the best way to achieve this step?

C. Is there really no way to do this without using permanent tables?

Any help appreciated. Thanks.

Upvotes: 0

Views: 2105

Answers (1)

Yun Zhang

Reputation: 5518

With a persistent table, you are able to leverage BigQuery data exporting to export the table in JSON format to GCS. The export itself has no cost, compared with reading the table from your server side.
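
For example, a minimal sketch with the Python client library (the table and bucket names are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Ask BigQuery itself to export the table as newline-delimited JSON;
# no rows pass through your server.
job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON
)
client.extract_table(
    "my-project.my_dataset.results_table",    # the persistent table
    "gs://my-bucket/exports/results-*.json",  # wildcard shards large exports
    job_config=job_config,
).result()  # wait for the export to finish
```

Note that BigQuery exports newline-delimited JSON (one object per line), not a single JSON array, so the front end may need to split on newlines.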

There is indeed a way to avoid creating a permanent table, because every query result is already stored in a temporary table. If you go to "Job Information" you can find the full name of the temp table, which can be used in data exporting to export it as JSON to GCS. However, this is way more complicated than just creating a persistent table and deleting it afterwards.
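A sketch of that route with the Python client, assuming the extract job is permitted to read the anonymous temporary table (names are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Run the query with no explicit destination; the result lands in an
# anonymous temporary table.
query_job = client.query("SELECT name, total FROM `my-project.my_dataset.source`")
query_job.result()

# The temporary table reference (the same name shown under "Job Information").
temp_table = query_job.destination

# Export the temporary table straight to GCS as newline-delimited JSON.
job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON
)
client.extract_table(temp_table,
                     "gs://my-bucket/exports/out.json",
                     job_config=job_config).result()
```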

Upvotes: 1
