vego

Reputation: 1059

How does MIG autoscale when autoscale policy is set to Target HTTP load balancing utilization?

I'm learning about load balancers and managed instance group (MIG) autoscaling. I do not understand how a MIG autoscales when using HTTP load balancing utilization.

In the MIG autoscaling settings, I set Target HTTP load balancing utilization to 10%.

When configuring the external HTTP load balancer, I have the following two options: utilization and rate.

I can understand CPU-based MIG autoscaling: if the average CPU usage is greater than the number I entered, the MIG adds more VMs to bring the average back down. It's very simple and straightforward.

But I do not know when the MIG will autoscale when using HTTP load balancing utilization.

Upvotes: 2

Views: 421

Answers (1)

Wojtek_B

Reputation: 4443

GCP offers three types of autoscaling policy for managed instance groups. Quoting the documentation:

You can choose to scale using the following policies:

  • Average CPU utilization.
  • HTTP load balancing serving capacity, which can be based on either utilization or requests per second.
  • Cloud Monitoring metrics (not supported by regional autoscalers)

The first one, as you said yourself, is pretty self-explanatory.

And this is what the official documentation says about requests-per-second (RPS) based autoscaling:

With RATE, you must specify a target number of requests per second on a per-instance basis or a per-group basis. (Only zonal instance groups support specifying a maximum rate for the whole group.)
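To make the per-instance case concrete, here is a rough sketch in Python. It assumes a simplified model in which the autoscaler treats the target utilization as a fraction of the per-instance maximum rate configured on the backend and sizes the group so that each VM stays at or below that fraction. The function name, parameters, and size limits are hypothetical, and the real autoscaler also applies stabilization and cool-down logic that is not modeled here.

```python
import math


def recommended_instances(total_rps, max_rate_per_instance, target_utilization,
                          min_instances=1, max_instances=10):
    """Estimate the group size under a simplified RATE-per-instance model.

    Assumption (not the autoscaler's actual algorithm): each VM should serve
    at most max_rate_per_instance * target_utilization requests per second.
    """
    per_instance_target = max_rate_per_instance * target_utilization
    needed = math.ceil(total_rps / per_instance_target)
    # Clamp to the group's configured size limits (hypothetical defaults).
    return max(min_instances, min(max_instances, needed))


# With a 10% target and a maximum of 100 RPS per instance, each VM is
# expected to take roughly 10 RPS before another VM is added.
print(recommended_instances(total_rps=45, max_rate_per_instance=100,
                            target_utilization=0.10))
```

Under this model, a 10% target with a backend limit of 100 RPS per instance means the group grows whenever the average load per VM rises above about 10 RPS.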

But there is a limitation to RPS-based autoscaling:

Autoscaling does not work with maximum requests per group because this setting is independent of the number of instances in the instance group. The load balancer continuously sends the maximum number of requests per group to the instance group, regardless of how many instances are in the group.

For example, if you set the backend to handle 100 maximum requests per group per second, the load balancer sends 100 requests per second to the group, whether the group has two instances or 100 instances. Because this value cannot be adjusted, autoscaling does not work with a load balancing configuration that uses the maximum number of requests per second per group.
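The quoted example can be sketched with a few lines of Python. This is only an illustration of the reasoning in the documentation, with hypothetical function names; it is not how the load balancer is implemented.

```python
def group_load(max_rate_per_group, instance_count):
    """Traffic sent to the whole group under a per-group cap.

    The cap is defined for the group, so instance_count is deliberately
    ignored: adding or removing VMs does not change what the group receives.
    """
    return max_rate_per_group


def load_per_instance(max_rate_per_group, instance_count):
    """Share of the fixed group traffic that each VM ends up serving."""
    return group_load(max_rate_per_group, instance_count) / instance_count


for count in (2, 100):
    print(count, group_load(100, count), load_per_instance(100, count))
```

The group-level figure stays at 100 RPS for both 2 and 100 instances, so it gives the autoscaler nothing to react to when the group size changes.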

You may also find it useful to have a look at the types of GCP load balancing supported in various scenarios.

This document also describes when it's best not to use some types of load balancing.

Upvotes: 0
