Jash Shah

Reputation: 2164

Unable to create a version in Cloud AI Platform using custom containers for prediction

Because of certain VPC restrictions I am forced to use custom containers for predictions for a model trained on TensorFlow. According to the documentation requirements, I have created an HTTP server using TensorFlow Serving. The Dockerfile used to build the image is as follows:

FROM tensorflow/serving:2.3.0-gpu

# copy the model file
ENV MODEL_NAME=my_model
COPY my_model /models/my_model

Where my_model contains the saved_model inside a folder named 1/.
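For context, that layout is what TensorFlow Serving expects: a numeric version sub-directory containing the SavedModel. A minimal sketch of how such a layout can be produced (the toy Keras model below is only a placeholder for my actual model):

import tensorflow as tf

# placeholder model, just to illustrate the export; the real model is different
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])

# TensorFlow Serving expects numeric version sub-directories under the model dir
tf.saved_model.save(model, "my_model/1")

# resulting layout:
#   my_model/1/saved_model.pb
#   my_model/1/variables/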

I have then pushed the container image to Artifact Registry and created a Model. To create a Version I have selected Custom Container on the Cloud Console UI and added the path to the container image. I have set both the Prediction route and the Health route to /v1/models/my_model:predict and changed the Port to 8501. I have also selected the machine type to be a single compute node of type n1-standard-16 with 1 P100 GPU and kept the scaling as Auto scaling.
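For reference, the same Version can in principle be created programmatically rather than through the Console. The sketch below uses google-api-python-client against the AI Platform v1 API; the project, repository and image names are placeholders, and the exact body fields (container, routes, acceleratorConfig) are my assumptions about the Version resource, so treat it as illustrative only:

from googleapiclient import discovery

# custom containers require a regional endpoint; us-central1 is a placeholder
ml = discovery.build(
    "ml",
    "v1",
    client_options={"api_endpoint": "https://us-central1-ml.googleapis.com"},
)

# field names below are assumptions based on the v1 Version resource
version_body = {
    "name": "v1",
    "machineType": "n1-standard-16",
    "acceleratorConfig": {"count": 1, "type": "NVIDIA_TESLA_P100"},
    "autoScaling": {"minNodes": 1},
    "container": {
        "image": "us-central1-docker.pkg.dev/my-project/my-repo/my_model:latest",
        "ports": [{"containerPort": 8501}],
    },
    "routes": {
        "predict": "/v1/models/my_model:predict",
        "health": "/v1/models/my_model:predict",
    },
}

request = ml.projects().models().versions().create(
    parent="projects/my-project/models/my_model",
    body=version_body,
)
print(request.execute())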

After clicking on Save I can see the TensorFlow server starting, and the logs show the following messages:

Successfully loaded servable version {name: my_model version: 1}

Running gRPC ModelServer at 0.0.0.0:8500

Exporting HTTP/REST API at:localhost:8501

NET_LOG: Entering the event loop

However, after about 20-25 minutes the version creation just stops, throwing the following error:

Error: model server never became ready. Please validate that your model file or container configuration are valid.

I am unable to figure out why this is happening. I am able to run the same Docker image on my local machine and successfully get predictions by hitting the endpoint that is created: http://localhost:8501/v1/models/my_model:predict
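Roughly, the local test looks like this (the payload shape depends on the model's serving signature, so the values below are only placeholders):

import requests

# placeholder input; the real instances match my model's serving signature
payload = {"instances": [[1.0, 2.0, 3.0]]}

resp = requests.post("http://localhost:8501/v1/models/my_model:predict", json=payload)
print(resp.status_code)
print(resp.json())  # {"predictions": [...]} when the server is serving correctly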

Any help in this regard will be appreciated.

Upvotes: 1

Views: 1317

Answers (2)

Jash Shah

Reputation: 2164

Answering this myself after working with the Google Cloud Support Team to figure out the error.

It turns out the port I was creating the Version on was conflicting with the Kubernetes deployment on Cloud AI Platform's side. So I changed the Dockerfile to the following and was able to successfully run Online Predictions on both Classic AI Platform and Unified AI Platform:

FROM tensorflow/serving:2.3.0-gpu

# Set where models should be stored in the container
ENV MODEL_BASE_PATH=/models
RUN mkdir -p ${MODEL_BASE_PATH}

# copy the model file
ENV MODEL_NAME=my_model
COPY my_model /models/my_model

# expose the gRPC (5000) and REST (8080) ports
EXPOSE 5000
EXPOSE 8080

CMD ["tensorflow_model_server", "--rest_api_port=8080", "--port=5000", "--model_name=my_model", "--model_base_path=/models/my_model"]

Upvotes: 2

klesouza

Reputation: 71

Have you tried using a different health path? I believe /v1/models/my_model:predict only accepts HTTP POST, but health checks usually use HTTP GET.

You might need a GET endpoint for your health check path.

Edit: going by the docs at https://www.tensorflow.org/tfx/serving/api_rest, you might be able to use just /v1/models/my_model as your health endpoint.
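Something like this (an untested sketch, assuming the container is reachable locally on port 8501) should confirm that the model status endpoint answers GET requests:

import requests

# TF Serving's model status endpoint answers GET, which is what a health check needs
resp = requests.get("http://localhost:8501/v1/models/my_model")
print(resp.status_code)
print(resp.json())  # expect "model_version_status" with state "AVAILABLE"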

Upvotes: 0
