Lucas Mähn

Reputation: 906

Is there a minimum time during which a Cloud Run instance isn't scaled down when using always-allocated CPU?

I currently have a typical Cloud Run service implemented and want to extend it with some asynchronous functionality, which is executed in response to an incoming HTTP request. These asynchronous tasks would take no longer than 5-10 minutes.

I am now wondering whether Cloud Run services with the "always allocated CPU" option enabled can guarantee a 15-minute window of allocated CPU time after the response to the last request has been sent. I understand that an instance that has not received a request in more than 15 minutes is subject to termination. But does it also work the other way around?

I found the following paragraph in the Cloud Run documentation:

Even if CPU is always allocated, Cloud Run autoscaling is still in effect, and may terminate container instances if they aren't needed to handle incoming traffic. An instance will never stay idle for more than 15 minutes after processing a request (unless it is kept active using min instances).

(https://cloud.google.com/blog/products/serverless/cloud-run-gets-always-on-cpu-allocation)

This is the only place in the article where the 15-minute interval is mentioned, though, and the Gantt chart in the article does not show any fixed period of guaranteed CPU allocation after the last response is sent.

Is there a guaranteed interval of CPU time after a request?

Upvotes: 6

Views: 4638

Answers (2)

Robert G

Reputation: 2065

According to the documentation you provided, this is not possible (the CPU allocation (services) documentation says the same):

Note that even if CPU is always allocated, Cloud Run autoscaling is still in effect, and may terminate container instances if they aren't needed to handle incoming traffic. An instance will never stay idle for more than 15 minutes after processing a request unless it is kept active using minimum instances.

One way to keep idle instances permanently available is to set min instances to 1 or more; however, this incurs cost even when the service is not actively handling any requests.

You can check this documentation about container instance autoscaling for additional information.
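
Because the autoscaler may remove an instance while background work is still running, that work should be able to stop cleanly. As far as I know, Cloud Run sends SIGTERM to the container before shutting an instance down and allows only a short grace period. The following is a minimal Python sketch under that assumption; `run_background_task`, `do_step`, and `save_checkpoint` are hypothetical names for illustration, not part of any Cloud Run API.

```python
import signal
import threading

# Set once the platform asks this instance to shut down.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only record that a shutdown was requested."""
    shutdown_requested.set()


def run_background_task(steps, do_step, save_checkpoint):
    """Run steps in order and return how many were completed.

    If a shutdown was requested, stop before the next step and pass
    the number of completed steps to save_checkpoint (hypothetical
    callback, e.g. a write to a database) so the work can be resumed.
    """
    completed = 0
    for step in steps:
        if shutdown_requested.is_set():
            save_checkpoint(completed)
            break
        do_step(step)
        completed += 1
    return completed


if __name__ == "__main__":
    # Register the handler from the main thread at startup.
    signal.signal(signal.SIGTERM, handle_sigterm)
    done = run_background_task(
        range(3),
        do_step=lambda s: print("step", s),
        save_checkpoint=lambda n: print("checkpoint at", n),
    )
    print("completed", done)
```

Keeping each step short matters here, since the check for a shutdown request only happens between steps.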

Upvotes: 1

Filip Dupanović

Reputation: 33690

My take from the container contract was that it's best to confine yourself to the request/response flow, though I have mostly been practicing this because it makes tracing requests easier.

While nothing clearly states that using the allotted idle time for out-of-band processing, as you intended, is not permitted, it would probably be prudent to use either Cloud Tasks or the new Cloud Run jobs workload, if that's an option.
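
To illustrate the Cloud Tasks route, here is a rough Python sketch that hands the work to a queue, which then calls back into the service as an ordinary HTTP request, so the processing stays inside the request/response flow. It assumes the `google-cloud-tasks` client library is installed and that a queue already exists; the function names, the URL, and the payload shape are hypothetical placeholders.

```python
import json


def build_http_task(url, payload):
    """Describe an HTTP task as a plain dict (no network access)."""
    return {
        "http_request": {
            "url": url,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload).encode("utf-8"),
        }
    }


def enqueue(project, location, queue, url, payload):
    """Create the task in an existing queue (requires credentials)."""
    # Imported lazily so build_http_task works without the library.
    from google.cloud import tasks_v2

    client = tasks_v2.CloudTasksClient()
    task = build_http_task(url, payload)
    task["http_request"]["http_method"] = tasks_v2.HttpMethod.POST
    parent = client.queue_path(project, location, queue)
    return client.create_task(request={"parent": parent, "task": task})


if __name__ == "__main__":
    # Hypothetical worker endpoint on the same Cloud Run service.
    task = build_http_task("https://example.com/work", {"job_id": 7})
    print(task["http_request"]["url"])
```

The request handler would call `enqueue(...)` and return immediately; the worker endpoint then does the long-running part while its own request is open.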

Upvotes: 1
