rodrigo-silveira

Reputation: 13068

Google Cloud Run Starts Container Before Healthcheck is Healthy

I have an image with a relatively lengthy startup time of ~5 seconds. The Flask server is up and running during that window, but I am still loading some data into global variables, so the server is not really operational yet. If I hit my Google Cloud Run endpoint during this time, the connection times out with

upstream request timeout

To avoid this, I added a Docker HEALTHCHECK that calls an endpoint on my server. This HTTP request has a timeout of 2 seconds. If it times out, the server is still loading that global data and the endpoint is not ready to receive requests yet. This works fine in development, but not in Cloud Run. Cloud Run starts serving traffic to my server before it's done loading, and therefore before the container's HEALTHCHECK status is actually "healthy".
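For illustration, the kind of check described above can be sketched with the Python standard library alone. This is an assumption about what such a check might look like, not the actual healthcheck from the question; the function name `is_healthy` and the URL are hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=2.0):
    """Return True only if the endpoint answers with HTTP 200 within `timeout` seconds.

    Hypothetical helper: a refused connection, a timeout, or an HTTP error
    status is treated as "not ready yet".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # URLError covers refused connections and HTTP error statuses;
        # OSError covers socket-level timeouts while reading the response.
        return False
```

A call such as `is_healthy("http://localhost:8080/health")` (hypothetical URL) would then return `False` while the server is still loading or unreachable.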

My question

How can I stop Cloud Run from delivering traffic to my container until it's fully set up?

Edit: answer

In my case (using Python + Gunicorn) I was able to solve this using the "application factory" pattern. That is, start Gunicorn with

$ gunicorn 'test:create_app()'

Where the function create_app() returns the Flask application.

My hypothesis as to why this works is that, until that function returns, Gunicorn is not yet listening on the port it binds to, and Cloud Run won't start driving traffic to your newly started container until it is.
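A minimal sketch of the factory idea follows, under some assumptions: the module is called `test.py` to match the command above, `load_reference_data` is a hypothetical stand-in for the slow loading step, and a plain WSGI callable is returned instead of a Flask object to keep the example dependency-free. With Flask, `create_app()` would build and return the Flask application in the same place.

```python
# test.py (module name assumed from the gunicorn command above)
import json
import time


def load_reference_data():
    """Hypothetical stand-in for the slow start-up work from the question."""
    time.sleep(0.1)  # simulate a slow load
    return {"status": "ready", "items": [1, 2, 3]}


def create_app():
    """Application factory: do all slow loading first, then return the app."""
    data = load_reference_data()  # blocks until the data is fully loaded

    def app(environ, start_response):
        # By the time this callable exists, `data` is already available.
        body = json.dumps(data).encode("utf-8")
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app
```

The point of the design is only that no request handler can run before `create_app()` has returned, so the data is never half-loaded when a request arrives.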

Upvotes: 2

Views: 2269

Answers (3)

txomon

Reputation: 662

We hit a "feature" where Cloud Run, for gRPC services, declares a new revision healthy as soon as a single instance of that revision is healthy. This bug surfaces when you have really slow startup times (15-20 s).

Google didn't accept this as a bug, but the evidence they provided shows that requests were queued on the nodes while waiting for instances of the new revision to start, instead of being routed to instances of the old revision, which caused brief downtime between deploys.

Another bug you may encounter: if your instance has some processing to do before it becomes healthy and you enable cpu-boost, Cloud Run will conclude that everything has gone south and limit traffic to 1 request per instance, causing your services to overscale for no reason.

Relevant bug: https://issuetracker.google.com/issues/377764060?pli=1

Upvotes: 0

neoakris

Reputation: 5075

Note: Cloud Run now supports liveness and startup probes, per https://cloud.google.com/run/docs/configuring/healthchecks

I was very surprised to learn that Cloud Run used to not support standard Kubernetes probes, but it seems they were added in a recent update. I'm not sure exactly when that happened, but at the time of this post (Oct 3, 2022) Cloud Run health checks are considered to be in "Preview". Readiness probes still aren't a thing, but startup probes are now allowed, so the original cool hack can now be replaced by a standard startup probe.

Here's a walkthrough of how to implement / test startup probe: https://stackoverflow.com/a/73942357/2548914
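To get a feel for how a startup probe behaves before configuring one, its retry loop can be roughly approximated locally. The sketch below is not Cloud Run's implementation; the function name `wait_until_started` is hypothetical, and its parameters are only modelled on the probe's `periodSeconds` and `failureThreshold` settings.

```python
import time


def wait_until_started(check, period_seconds=1.0, failure_threshold=10):
    """Call `check()` up to `failure_threshold` times, `period_seconds` apart.

    Returns True as soon as one check succeeds, and False if every attempt
    fails. `check` is any zero-argument callable returning a bool.
    """
    for attempt in range(failure_threshold):
        if check():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < failure_threshold - 1:
            time.sleep(period_seconds)
    return False
```

Here `check` could be any function that requests the container's health endpoint and reports whether it answered successfully.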

Upvotes: 0

sllopis

Reputation: 2368

rodrigo-silveira's solution:

In my case (using Python + Gunicorn) I was able to solve this using the "application factory" pattern. That is, start Gunicorn with

$ gunicorn 'test:create_app()'

Where the function create_app() returns the Flask application.

My hypothesis as to why this works is that, until that function returns, Gunicorn is not yet listening on the port it binds to, and Cloud Run won't start driving traffic to your newly started container until it is.

Upvotes: 1
