harry

Reputation: 135

batch predictions in GCP Vertex AI

While trying out batch predictions in GCP Vertex AI for an AutoML model, I noticed that the batch prediction results are spread across several files, which is not convenient from a user perspective. A single batch prediction result file, i.e. one covering all the records, would make the procedure much simpler.

For instance, my input dataset file had 5585 records. The batch prediction results comprise 21 files, each containing roughly 200-300 records, covering all 5585 records altogether.

Upvotes: 2

Views: 2442

Answers (1)

Sandeep Mohanty

Reputation: 1552

Batch prediction on an image, text, video, or tabular AutoML model runs the job using distributed processing: the data is distributed among an arbitrary cluster of virtual machines and processed in an unpredictable order, which is why the prediction results are stored across multiple files in Cloud Storage. Because the batch prediction output files are not generated in the same order as the input file, a feature request has been raised, and you can track updates on it from this link.

We cannot provide an ETA at this moment, but you can follow the progress in the issue tracker and 'STAR' the issue to receive automatic updates and give it traction, by referring to this link.
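In the meantime, if you need a single result file, you can merge the per-shard output files yourself. Below is a minimal sketch using the Cloud Storage Python client, assuming JSONL output; the bucket name and output prefix are placeholders, and the exact shard naming varies by model type, so adjust the filename filter to match your job's output.

```python
from google.cloud import storage

# Placeholder values; replace with your bucket and the batch prediction
# job's output directory (shown as gcsOutputDirectory in the job details).
BUCKET_NAME = "my-predictions-bucket"
OUTPUT_PREFIX = "my-batch-prediction-job/"

client = storage.Client()

# Concatenate every prediction shard into one local JSONL file,
# skipping any error files the job may also write.
# Shard naming differs by model type; adjust the filter as needed.
with open("merged_predictions.jsonl", "w") as merged:
    for blob in client.list_blobs(BUCKET_NAME, prefix=OUTPUT_PREFIX):
        filename = blob.name.rsplit("/", 1)[-1]
        if filename.startswith("prediction") and "errors" not in filename:
            merged.write(blob.download_as_text())
```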

However, if you are doing batch prediction for a tabular AutoML model, you have the option to choose BigQuery as the output destination, where all the prediction output is stored in a single table, and you can then export that table to a single CSV file, as sketched below.
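For illustration, here is a minimal sketch with the Vertex AI Python SDK that writes predictions to BigQuery and then exports the result table to one CSV in Cloud Storage. The project, dataset, table, model, and bucket names are placeholders, and the prediction table name created by the job must be looked up from the finished job before exporting.

```python
from google.cloud import aiplatform, bigquery

# Placeholder project, region, and model resource name; replace with your own.
aiplatform.init(project="my-project", location="us-central1")
model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/1234567890"
)

# Run the batch prediction with BigQuery as both source and destination,
# so all predictions land in a single table instead of multiple GCS files.
job = model.batch_predict(
    job_display_name="tabular-batch-prediction",
    bigquery_source="bq://my-project.my_dataset.input_table",
    bigquery_destination_prefix="bq://my-project.my_output_dataset",
    instances_format="bigquery",
    predictions_format="bigquery",
)
job.wait()

# Export the prediction table created by the job to a single CSV file.
# Replace the table name with the one shown in the job's output details.
# Note: tables larger than 1 GB must be exported with a wildcard URI,
# which again produces multiple files.
bq_client = bigquery.Client(project="my-project")
extract_job = bq_client.extract_table(
    "my-project.my_output_dataset.predictions_table",
    "gs://my-bucket/all_predictions.csv",
)
extract_job.result()
```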

Upvotes: 2
