truthordare

Reputation: 33

How do I keep a long-running background task from stopping on Google Cloud Run?

I have a FastAPI endpoint which starts some long-running background tasks:

background_tasks.add_task(fn)

These tasks can take up to 10 hours to complete.

I am running this inside a Docker container using gunicorn:

# Start the application
CMD exec gunicorn --bind :$PORT --workers 1 -k uvicorn.workers.UvicornWorker --threads 8 --timeout 0 main:app

When I run the container locally and call the endpoint from Postman, the tasks start and, if left running, complete fine.

I'm now hosting the container on Google Cloud Run. When I call the endpoint (XXX.run.app), the logs show the tasks starting, but they appear to stop partway through, and some time later the service restarts.

I've tried increasing the request timeout to the maximum but this didn't help. Is there something else I can try?

Upvotes: 2

Views: 928

Answers (2)

Connor

Reputation: 423

In case anyone lands here still looking for a solution: I also have a FastAPI setup, though I only need the task to stay alive for 5-10 minutes, not 10 hours.

Configuring the service to always keep 1 instance running and to keep the CPU allocated seems to work for me, with some caveats. Your process will still be killed if the container scales down or you deploy a new revision on top of it, so it is still less than ideal.

Upvotes: 1

guillaume blaquiere

Reputation: 75705

Your design does not match what you expect from it. You are mixing a "real time" API (FastAPI) with long-running batches in the background.

2 behaviors, 2 types of services -> separation of concerns.

  • I recommend using Cloud Run for your FastAPI backend.
  • Then use a batch-oriented service for your batches (Cloud Batch, Cloud Run Jobs (but jobs with a timeout above 1h are only in private preview for now), Kubernetes Jobs on GKE (Autopilot or not), ...)
  • To synchronize the 2, Pub/Sub and Cloud Functions (or Cloud Run again) are perfect. From FastAPI, post a message to Pub/Sub. Catch it with Cloud Functions (or Cloud Run), which runs the job process and acks the message.
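
To make the last bullet more concrete, here is a minimal sketch of the hand-off, not a definitive implementation. The function names, the message fields (job_id, params) and the project/topic IDs are all hypothetical. It assumes the google-cloud-pubsub client library for publishing and a Pub/Sub push subscription on the receiving side, which POSTs a JSON envelope whose message.data field is base64-encoded.

```python
import base64
import json
from typing import Callable


def build_job_message(job_id: str, params: dict) -> bytes:
    """Serialize a job request as the bytes payload of a Pub/Sub message.

    The field names here are an assumption made for illustration.
    """
    return json.dumps({"job_id": job_id, "params": params}).encode("utf-8")


def enqueue_job(publish: Callable[[bytes], str], job_id: str, params: dict) -> str:
    """Call this from the FastAPI endpoint instead of background_tasks.add_task.

    `publish` sends the bytes somewhere and returns a message ID, so the
    endpoint can return right away while the work happens elsewhere.
    """
    return publish(build_job_message(job_id, params))


def make_pubsub_publish(project_id: str, topic_id: str) -> Callable[[bytes], str]:
    """Build a `publish` callable backed by the google-cloud-pubsub client."""
    # Imported lazily so the rest of the sketch runs without the library.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)

    def publish(data: bytes) -> str:
        # publish() returns a future; result() blocks until the message ID is known.
        return publisher.publish(topic_path, data).result()

    return publish


def decode_push_envelope(envelope: dict) -> dict:
    """Decode the JSON body a push subscription POSTs to the worker service."""
    data = base64.b64decode(envelope["message"]["data"])
    return json.loads(data)
```

In the API you would call something like enqueue_job(make_pubsub_publish("my-project", "jobs"), "job-42", {...}) (placeholder IDs). On the other side, the Cloud Functions or Cloud Run handler would pass the request body to decode_push_envelope, start the batch job, and return a 2xx response, which is how a push subscription acks the message.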

Upvotes: 2
