Anthony Mattas

Reputation: 311

How does Google Cloud Run handle Async Calls?

How does Google Cloud Run handle Flask apps with long-running asynchronous code? This scenario doesn't seem to be well documented.

Specifically, I want to run something like the code below. What will happen? Will it get terminated in-flight if no requests are coming in? Will it run through to completion and then scale to zero?

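from threading import Thread
from flask import Flask, Response

app = Flask(__name__)

@app.route('/ImportVendorProducts')
def import_vendor_products():
    try:
        # import_wtc_products is the long-running job, defined elsewhere.
        t = Thread(target=import_wtc_products)
        t.start()
        response = Response(response="Started import_wtc_products", status=202)
    except Exception:
        response = Response(response="import_wtc_products failed", status=500)
    return response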

Upvotes: 0

Views: 3697

Answers (3)

Lyubomir

Reputation: 311

Google's documentation says:

  • If CPU is allocated only during request processing:

If you need to set your service to allocate CPU only during request processing, when the Cloud Run service finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Review your code to make sure all asynchronous operations finish before you deliver your response.

So:

If you want to support background activities in your Cloud Run service, set your Cloud Run service CPU to be always allocated so you can run background activities outside of requests and still have CPU access.
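If CPU stays allocated only during request processing, the quoted advice amounts to waiting for the work before responding. Below is a minimal, framework-agnostic sketch of that idea. The helper name `handle_import_request`, the timeout value, and the placeholder body of `import_wtc_products` are assumptions for illustration, not part of the question's code.

```python
import threading
import time


def import_wtc_products():
    # Placeholder standing in for the real import job from the question.
    time.sleep(0.1)


def handle_import_request(job=import_wtc_products, timeout_seconds=5.0):
    """Run `job` in a thread, but wait for it before building the response."""
    errors = []

    def runner():
        # Capture exceptions so the request handler can report them.
        try:
            job()
        except Exception as exc:
            errors.append(exc)

    worker = threading.Thread(target=runner)
    worker.start()
    # Block until the job finishes (or the timeout elapses) so that no work
    # is left running after the response has been sent.
    worker.join(timeout=timeout_seconds)

    if worker.is_alive():
        # The job outlived the timeout; it would continue after the response,
        # which is exactly the situation the documentation warns about.
        return "import_wtc_products still running", 504
    if errors:
        return f"import_wtc_products failed: {errors[0]}", 500
    return "import_wtc_products finished", 200
```

The function returns a `(body, status)` tuple, which a Flask view can return directly. The trade-off is that the request stays open for the whole import, so the import has to fit within the service's request timeout.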

Upvotes: 1

Donnald Cucharo

Reputation: 4126

Serverless products (Cloud Run, Cloud Functions, App Engine Standard) track available capacity using a heuristic that accounts for CPU, memory, and requests. They also mark apps as idle if there are no requests. Background work creates a situation where CPU is being consumed (and the app at large is being utilized), but possibly no requests are reaching the instance. This misalignment in "app busy-ness" can cause poor decisions about the number of instances to create or kill, resulting in behaviors like:

  • The app has too many instances scaled up (more than the concurrent requests warrant), resulting in higher billing. This happens because the system creates more instances than needed due to high CPU load, even when there are no requests.
  • The app's long-running jobs (the background work) are prematurely killed by automatic downscaling.

For short-running tasks, you can use Cloud Tasks together with Cloud Run for asynchronous work, since Cloud Tasks is designed to run work outside of a user or service-to-service request.
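As a rough illustration of the Cloud Tasks approach, the sketch below enqueues an HTTP task that calls back into a Cloud Run endpoint, so the import runs inside its own request. It assumes the `google-cloud-tasks` client library is installed and that a queue already exists. The function names and all arguments (project, location, queue, handler URL, payload) are hypothetical, and authentication settings such as an OIDC token for a private service are omitted.

```python
import json


def encode_payload(payload):
    """Serialize the task payload to the bytes body sent with the HTTP task."""
    return json.dumps(payload).encode("utf-8")


def enqueue_import(project, location, queue, handler_url, payload):
    """Create a Cloud Tasks HTTP task that POSTs `payload` to `handler_url`."""
    # Imported lazily so encode_payload works without the client library.
    from google.cloud import tasks_v2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    task = {
        "http_request": {
            "http_method": tasks_v2.HttpMethod.POST,
            "url": handler_url,  # assumed: a Cloud Run route that runs the import
            "headers": {"Content-Type": "application/json"},
            "body": encode_payload(payload),
        }
    }
    return client.create_task(parent=parent, task=task)
```

With this layout, the route that receives the original request only enqueues the task and returns 202, while the route behind `handler_url` runs the import synchronously and therefore keeps its CPU for the duration of that request.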

Additional reference: https://cloud.google.com/run/docs/tips/general#avoiding_background_activities

Upvotes: 2

Pentium10

Reputation: 208002

Your container will have CPU allocated only during the HTTP request time.

So as soon as you have returned and served the response, the CPU is taken away and the background call will not execute.

Upvotes: 2
