Can you trigger autoscaling in Google App Engine based on a Cloud Pub/Sub queue?

Question

I know you can configure autoscaling based on queue size when scaling a Compute Engine instance group, but I'm unsure how to replicate this behavior in the App Engine flexible environment. Is this possible?

I want to decouple my frontend service from my backend so that they can work asynchronously, but I'm not sure how to scale the backend with the size of the Pub/Sub queue, which can get very big. The only scaling options I see in the automatic_scaling section of app.yaml have to do with CPU utilization.

LundinCast · Accepted Answer

The App Engine flexible environment currently supports autoscaling based only on a target CPU utilization (see the documentation for scaling settings).

Also note that with autoscaling you can't set the actual number of running instances, only bounds such as the "max_num_instances" value. You can raise that value programmatically via the App Engine Admin API's apps.services.versions.patch method, but the autoscaler will still decide whether to spawn new instances based on CPU utilization alone.
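As a rough illustration of the patch call mentioned above, here is a minimal sketch that only builds the request URL and JSON body; it does not send anything. The project, service and version names are placeholders, and the field names (an `automatic_scaling.max_total_instances` update mask with an `automaticScaling.maxTotalInstances` body field) are my reading of the Admin API reference, so check them against the current docs before relying on them. Actually sending the request would also need an OAuth 2.0 access token, which is left out here.

```python
import json
from urllib.parse import quote, urlencode

ADMIN_API = "https://appengine.googleapis.com/v1"


def build_max_instances_patch(project_id, service, version, max_instances):
    """Return (url, json_body) for a versions.patch call that raises the instance ceiling."""
    if max_instances < 1:
        raise ValueError("max_instances must be at least 1")
    # Resource path: apps/{project}/services/{service}/versions/{version}
    path = "/apps/{}/services/{}/versions/{}".format(
        quote(project_id, safe=""), quote(service, safe=""), quote(version, safe="")
    )
    # Assumption: update mask spelled as listed in the Admin API reference.
    query = urlencode({"updateMask": "automatic_scaling.max_total_instances"})
    body = {"automaticScaling": {"maxTotalInstances": max_instances}}
    return ADMIN_API + path + "?" + query, json.dumps(body)


# Placeholder identifiers, for illustration only.
url, body = build_max_instances_patch("my-project", "backend", "v1", 20)
print("PATCH", url)
print(body)
```

Even with a higher ceiling, whether extra instances are actually started still depends on CPU utilization, as described above.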

The best option, as you mentioned, would be to allow concurrent requests and multi-threading so that each instance is used to its full potential. You could then tune the CPU target_utilization value so that new instances are spawned when needed.
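To make the multi-threading suggestion a bit more concrete, here is a minimal sketch of handling several messages concurrently inside one instance, using only the standard library. `handle_message` and the byte-string payloads are hypothetical stand-ins for your real handler and for messages pulled from Pub/Sub; nothing here talks to Pub/Sub itself.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_message(payload: bytes) -> str:
    # Hypothetical handler: replace with the real work done per message.
    return payload.decode("utf-8").upper()


def process_batch(payloads, max_workers=8):
    """Run handle_message over a batch using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input payloads.
        return list(pool.map(handle_message, payloads))


# Stand-in payloads; in practice these would come from a subscription.
results = process_batch([b"resize", b"encode", b"notify"])
print(results)
```

In CPython, threads mostly help when the handler waits on I/O; CPU-bound handlers may need multiple processes instead to push CPU utilization up to the autoscaler's target.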
