502/Timeout in Vertex AI custom FastAPI/Uvicorn prediction container

Question

I am receiving an 'Error: 14 UNAVAILABLE: 502:Bad Gateway' response when calling my custom container for predictions in Vertex AI, from both the Node and Python clients. The Vertex AI prediction endpoint works for short predictions. Long predictions also run to completion on the server, but the client receives the error response before they finish.

In the Node client I was originally getting a '4 DEADLINE EXCEEDED' error, so I set the call option timeout higher.

Now I get 'Error: 14 UNAVAILABLE: 502:Bad Gateway' from both clients. Why is the request timing out, and where can that timeout be changed?
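To make the difference between the two errors concrete, here is a small sketch that parses the error strings quoted above into their gRPC status code and, when present, the embedded HTTP status. The helper name `parse_client_error` and the `ParsedError` type are made up for illustration and are not part of any Google client library. The `deadline_exceeded` flag only records that the code is 4; reading that as "the caller's own deadline elapsed" is my assumption, based on the error going away once the call timeout was raised.

```python
import re
from typing import NamedTuple, Optional


class ParsedError(NamedTuple):
    """Hypothetical container for the parts of a client error string."""
    grpc_code: int               # numeric gRPC status, e.g. 4 or 14
    grpc_name: str               # normalized name, e.g. "DEADLINE_EXCEEDED"
    http_status: Optional[int]   # embedded HTTP status, e.g. 502, if present
    deadline_exceeded: bool      # True only for gRPC code 4


# Matches "<code> <NAME>" optionally followed by ": <http status>:<reason>".
_PATTERN = re.compile(
    r"(?P<code>\d+)\s+(?P<name>[A-Z_ ]+?)(?::\s*(?P<http>\d{3}):.*)?$"
)


def parse_client_error(message: str) -> ParsedError:
    """Split an error string like the ones in the question into its parts."""
    match = _PATTERN.search(message.strip())
    if match is None:
        raise ValueError(f"unrecognized error format: {message!r}")
    code = int(match.group("code"))
    # The Node client prints "DEADLINE EXCEEDED" with a space; normalize it.
    name = match.group("name").strip().replace(" ", "_")
    http = match.group("http")
    return ParsedError(code, name, int(http) if http else None, code == 4)


if __name__ == "__main__":
    for text in ("4 DEADLINE EXCEEDED",
                 "Error: 14 UNAVAILABLE: 502:Bad Gateway"):
        print(parse_client_error(text))
```

Seen this way, the first error carries no HTTP status at all, while the second wraps an HTTP 502, which is why I suspect it is produced somewhere between the client and my container rather than by the client's own timeout.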

The most important thing to note is that the prediction does finish. It has to upload a file to Google Cloud Storage, and that upload succeeds. The logs show that the endpoint runs for as long as it needs to. Despite this, I get an early 502 from Vertex AI, which breaks my workflow for long-running predictions. So the question is: why am I getting a 502? I assume it comes from some internal timeout in GCP.

One more note: the container does have a health endpoint.

Any help is much appreciated.
