AWS - Autoscaling not launching/killing instances as specified

Question

I'm testing the AWS autoscaling. I've created a simple elastic beanstalk and edited the scaling settings. Here's a screenshot of the auto scaling group scaling policies:

As you can see, I'm trying to get the group to have between 5 and 10 healthy instances in permanence. The max and min number of instances have been set to 20 and 2.

I only get 2 instances (the minimum) running.

The alarms are defined as:

ping-server-too-healthy
ping-server-not-healthy-enough

The Load Balancer's alarms seem to be working correctly:

But the scaling group is not booting new instances. I've tried setting the alarms the other way around (I'm not sure whether they are triggered when they go from true to false or from false to true), and that leads to 20 instances (the maximum).

John Rotenstein · Accepted Answer

This is not the way that you should be using Auto Scaling.

When Elastic Beanstalk creates a "Load balancing, auto scaling" environment, it creates the Auto Scaling group for you. As part of these configurations, you can specify the minimum and maximum number of instances to launch in the Auto Scaling group:

Elastic Beanstalk auto scaling configuration

The Auto Scaling group will then keep the current Desired Capacity of instances within the minimum and maximum. If an instance fails (definition below), Auto Scaling will automatically replace that instance with another one to maintain the Desired Capacity.
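To make the relationship between minimum, maximum and Desired Capacity concrete, here is a minimal toy model in Python. It is only a sketch: `ToyGroup`, `set_desired` and `reconcile` are hypothetical names invented for illustration and are not part of any AWS API. The sizes 2 and 20 are the ones from the question.

```python
from dataclasses import dataclass


@dataclass
class ToyGroup:
    """Toy model of an Auto Scaling group (illustrative, not the AWS API)."""
    min_size: int
    max_size: int
    desired: int
    running: int = 0

    def set_desired(self, value: int) -> None:
        # Desired Capacity is always kept within [min_size, max_size].
        self.desired = max(self.min_size, min(self.max_size, value))

    def reconcile(self) -> int:
        """Bring the running count to Desired Capacity.

        Returns the number of instances launched (positive)
        or terminated (negative).
        """
        change = self.desired - self.running
        self.running = self.desired
        return change


group = ToyGroup(min_size=2, max_size=20, desired=2)
launched = group.reconcile()   # group starts at the minimum
group.running -= 1             # one instance fails
replaced = group.reconcile()   # the failed instance is replaced
group.set_desired(50)          # request above the maximum is clamped
```

Note that nothing in this model changes `desired` by itself. Unless a scaling policy (or a person) raises it, the group stays at whatever capacity it started with, which is consistent with seeing only the minimum of 2 instances.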

The Scaling Policies are then used to adjust the Desired Capacity. Scaling policies should use some measure of "load" to determine when to add or remove instances, such as CPU Utilization or the size of an Amazon SQS queue. The intention is to add additional servers when more capacity is required, and remove servers when there is too much capacity.
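A load-based policy can be sketched as a small function that moves Desired Capacity up or down by one step. This is a simplified illustration, not how AWS evaluates alarms internally: the function name `next_desired` and the 70% / 30% CPU thresholds are assumptions chosen for the example.

```python
def next_desired(
    current: int,
    cpu_percent: float,
    *,
    scale_out_above: float = 70.0,  # assumed threshold, pick your own
    scale_in_below: float = 30.0,   # assumed threshold, pick your own
    step: int = 1,
    min_size: int = 2,
    max_size: int = 20,
) -> int:
    """Return the new Desired Capacity for a given average CPU load."""
    if cpu_percent > scale_out_above:
        current += step   # more capacity is required
    elif cpu_percent < scale_in_below:
        current -= step   # there is too much capacity
    # The result never leaves the group's min/max bounds.
    return max(min_size, min(max_size, current))
```

The input here is a measure of workload (CPU), not the number of healthy hosts, so the policy reacts to demand rather than to the group's own size.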

The HealthyHostCount metric indicates how many servers have passed the Elastic Load Balancing health check. If an instance fails the health check, the Load Balancer stops sending it requests, but keeps performing the health check. If the instance becomes healthy again, the Load Balancer will resume sending requests to that server. The Elastic Load Balancing health check can be configured to check a particular page on the server to confirm that the application is healthy.

When Auto Scaling performs a health check, it is merely checking the status of the virtualization environment, in the same way that the EC2 management console shows 2/2 status checks. However, it is possible to configure Auto Scaling to use the Elastic Load Balancing health check. This way, Auto Scaling can be notified that the application is unhealthy and can automatically replace a failed instance (or an instance with a failed application).
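The difference between the two health check types can be expressed as a short decision function. This is a simplified sketch of the behavior described above; `needs_replacement` and its parameters are hypothetical names, and real Auto Scaling also applies a grace period that is left out here.

```python
def needs_replacement(
    ec2_status_ok: bool,
    elb_check_ok: bool,
    health_check_type: str,
) -> bool:
    """Decide whether Auto Scaling would treat an instance as unhealthy."""
    if not ec2_status_ok:
        # A failed EC2 status check counts under both health check types.
        return True
    if health_check_type == "ELB":
        # With ELB health checks, a failing application also counts.
        return not elb_check_ok
    # With the "EC2" type, the load balancer's verdict is ignored.
    return False
```

For an instance whose virtual machine is fine but whose application has crashed, `needs_replacement(True, False, "EC2")` is `False` while `needs_replacement(True, False, "ELB")` is `True`.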

You stated that your goal is "to get the group to have between 5 and 10 healthy instances in permanence". This is the job of Auto Scaling, especially when it has been configured to use Elastic Load Balancing health checks. The job of a scaling policy is to determine when to add/remove instances based upon workload. Scaling policies should not be used as a means of replacing unhealthy instances.

So, I recommend:

Configure a Health Check in your Load Balancer that accurately checks the health of your application
Turn on ELB health checks in your Auto Scaling group (which will ensure that instances with unhealthy applications will be replaced)
Use Scaling Policies to add/remove instances based upon workload rather than based on health
Trust the system. It works!
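As a rough sketch of the second and third recommendations, the following Python uses the boto3 Auto Scaling client. Several things here are assumptions rather than part of the original answer: the group name you pass in, the 300-second grace period, the policy name `cpu-target-tracking`, the 50% CPU target, and the choice of a target tracking policy as the workload-based policy. For an Elastic Beanstalk environment it is usually better to make the equivalent changes through the environment's own configuration, since changes made directly to the group may be overwritten.

```python
def elb_health_check_params(group_name: str, grace_seconds: int = 300) -> dict:
    """Arguments for update_auto_scaling_group that enable ELB health checks."""
    return {
        "AutoScalingGroupName": group_name,
        "HealthCheckType": "ELB",
        # Time allowed for a new instance to boot before checks count.
        "HealthCheckGracePeriod": grace_seconds,
    }


def cpu_target_tracking_params(group_name: str, target_cpu: float = 50.0) -> dict:
    """Arguments for put_scaling_policy that scale on average CPU load."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "cpu-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,  # assumed target, tune for your app
        },
    }


def apply(group_name: str) -> None:
    """Send both changes to AWS (needs boto3, credentials and a region)."""
    import boto3  # third-party; imported here so the helpers work without it

    client = boto3.client("autoscaling")
    client.update_auto_scaling_group(**elb_health_check_params(group_name))
    client.put_scaling_policy(**cpu_target_tracking_params(group_name))
```

Calling `apply("my-asg-name")` with the real name of the Auto Scaling group would apply both settings; the two helper functions only build the request arguments and can be inspected without touching AWS.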
