willy

Reputation: 99

Specifying the BigQuery dataset/table name for AutoML Batch Prediction results

Basically, I want to specify the BigQuery dataset and table name for the batch prediction results of AutoML.

But looking at the following documentation, the dataset and table names are generated automatically, and a new dataset is created for each batch prediction that is executed:

https://cloud.google.com/automl-tables/docs/predict-batch#bq-results

Looking at the following documentation, only the projectId can be specified in the BigQuery destination:

https://cloud.google.com/automl/docs/reference/rest/v1beta1/BigQueryDestination
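For reference, the BigQuery destination in a v1beta1 batchPredict request only takes a project-level URI, so there is nowhere to put a dataset or table name (a sketch only; the project ID is a placeholder):

```python
# Sketch of the output_config for a v1beta1 batchPredict request.
# "my-project-id" is a placeholder. BigQueryDestination only accepts a
# project-level URI, so the output dataset/table cannot be chosen here.
output_config = {
    "bigquery_destination": {
        "output_uri": "bq://my-project-id"
    }
}
```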

I plan to run the batch prediction automatically on a weekly basis. To keep the prediction results "cleaner", I want to group all of the prediction results into one dataset instead of having a separate dataset for each batch prediction.

Is there any way to get it done via the provided API?

Upvotes: 5

Views: 1001

Answers (1)

If it's not documented, there is no way to do it via the API. However, if you want all of the output in the same dataset, you can send the results to a bucket directory [1]; this way the batch prediction will create multiple CSV files [2] in your Google Cloud Storage bucket.
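For illustration, a batch prediction with a Cloud Storage destination could look roughly like this with the v1beta1 Python client (google-cloud-automl). The exact call may vary with the client version, and all project, region, model and URI values below are placeholders:

```python
from google.cloud import automl_v1beta1 as automl

client = automl.PredictionServiceClient()
model_name = client.model_path("my-project-id", "us-central1", "TBL0000000000000000")

# Read the rows to score from CSV files in Cloud Storage.
input_config = automl.types.BatchPredictInputConfig(
    gcs_source=automl.types.GcsSource(
        input_uris=["gs://my-bucket/input/rows.csv"]
    )
)

# Write the results as CSV files under a Cloud Storage prefix instead of BigQuery.
output_config = automl.types.BatchPredictOutputConfig(
    gcs_destination=automl.types.GcsDestination(
        output_uri_prefix="gs://my-bucket/predictions/"
    )
)

# batch_predict returns a long-running operation; result() blocks until it finishes.
response = client.batch_predict(model_name, input_config, output_config)
response.result()
```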

Therefore, you now have to set up a way to read the new files from this bucket and create a new table in the desired dataset each time you run a batch prediction. Here is documentation that shows how to create a new table from a CSV file in Google Cloud Storage [3].
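One way to implement this with the BigQuery Python client is a load job that loads the new CSV result files into a table in the single dataset you want to keep (a sketch only; all names and URIs are placeholders, and the source URI depends on where the batch prediction wrote its files):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project-id")

# Target table inside the one dataset that should hold every week's results.
table_id = "my-project-id.all_predictions.prediction_2020_01_06"

job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.CSV
job_config.skip_leading_rows = 1   # the result CSVs contain a header row
job_config.autodetect = True       # infer the schema from the files

# A single "*" wildcard picks up every CSV file under the prediction output prefix.
load_job = client.load_table_from_uri(
    "gs://my-bucket/predictions/*.csv",
    table_id,
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
```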

The other way is to copy the newly created table into the desired dataset [4], but you have to do this every time a new table is created.
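A sketch of this option with the BigQuery Python client (again, every project, dataset and table name below is a placeholder):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project-id")

# The table that the batch prediction created in its auto-generated dataset.
source_table = "my-project-id.prediction_my_model_2020_01_06.predictions"
# Where you actually want it to live.
dest_table = "my-project-id.all_predictions.prediction_2020_01_06"

copy_job = client.copy_table(source_table, dest_table)
copy_job.result()  # wait for the copy to finish

# Optionally drop the auto-generated dataset afterwards to keep the project tidy.
# client.delete_dataset(
#     "my-project-id.prediction_my_model_2020_01_06",
#     delete_contents=True,
#     not_found_ok=True,
# )
```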

[1] https://cloud.google.com/automl-tables/docs/predict-batch#using_csv_files_in

[2] https://cloud.google.com/automl-tables/docs/predict-batch#csv-results

[3] https://cloud.google.com/bigquery/external-data-cloud-storage#creating_and_querying_a_permanent_external_table

[4] https://cloud.google.com/bigquery/docs/managing-tables#copying_a_single_source_table

Upvotes: 2
