I'm trying to deploy a TorchServe instance on the Google Vertex AI platform, but as per their documentation (https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#response_requirements), responses are required to have the following shape:
```json
{
  "predictions": PREDICTIONS
}
```
where PREDICTIONS is an array of JSON values representing the predictions that your container has generated.
Unfortunately, when I try to return such a shape from the postprocess() method of my custom handler, like this:

```python
def postprocess(self, data):
    return {
        "predictions": data
    }
```
TorchServe returns:
```json
{
  "code": 503,
  "type": "InternalServerException",
  "message": "Invalid model predict output"
}
```
Please note that `data` is a list of lists, for example `[[1, 2, 1], [2, 3, 3]]` (basically, I am generating embeddings from sentences).
Now, if I simply return `data` (and not a Python dictionary), it works with TorchServe, but when I deploy the container on Vertex AI, it fails with a ModelNotFoundException. I assume Vertex AI throws this error because the response shape does not match what is expected (cf. the documentation above).
Did anybody successfully manage to deploy a TorchServe instance with custom handler on Vertex AI?