tobi

Reputation: 2012

Target tracking autoscaling scaling out excessively

I have set up a target tracking policy in AWS Application Auto Scaling with a minimum of 3 instances and a maximum of 32. The target value is 80% CPU utilization. Once the metric stays above the target for 3 minutes, CloudWatch triggers a scale-out. However, the newly created instances of our service are not well warmed up, so they disrupt the average CPU usage with additional spikes; target tracking then scales out even more, creating even more spikes, and so on. The cooldown doesn't seem to help here, since scaling will occur anyway if the new spikes are bigger than the previous ones.
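For reference, the setup above can be sketched as the two payloads that Application Auto Scaling's `RegisterScalableTarget` and `PutScalingPolicy` calls accept (e.g. via boto3). This is a minimal sketch assuming an ECS service; the cluster/service names and cooldown values are placeholders, not from the question.

```python
# Sketch of the policy described above (min 3 / max 32 tasks, 80% CPU target),
# expressed as the request dicts for boto3's "application-autoscaling" client:
# client.register_scalable_target(**scalable_target)
# client.put_scaling_policy(**scaling_policy, ServiceNamespace=..., ResourceId=..., ScalableDimension=...)
# The ECS resource id below is an assumed placeholder.

scalable_target = {
    "ServiceNamespace": "ecs",
    "ResourceId": "service/my-cluster/my-service",  # hypothetical service
    "ScalableDimension": "ecs:service:DesiredCount",
    "MinCapacity": 3,
    "MaxCapacity": 32,
}

# Longer cooldowns give fresh tasks time to warm up before their CPU spikes
# are folded into the tracked average; the values here are illustrative only.
scaling_policy = {
    "PolicyName": "cpu-target-tracking",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "TargetValue": 80.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 300,  # seconds; assumed value
        "ScaleInCooldown": 900,   # seconds; assumed value
    },
}
```

Note that `ScaleOutCooldown` only throttles *further* scale-outs after one has occurred; as described above, it does not stop the alarm from firing again if the warmup spikes push the average higher than before.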

Moreover, if an instance goes down and needs to be recreated, that also spikes CPU usage, which sometimes breaks the chain of scale-in events (as a result, we have to wait another 15 minutes to scale in).

The problem here is that the CPU usage comes mostly from processing SQS events rather than HTTP requests, so slow-start mode doesn't help.

I was thinking about using a metric based on the number of SQS messages for autoscaling, but firstly it's not available right away (according to this), and secondly it's hard to predict how a given number of messages maps to CPU usage. I found that EC2 Auto Scaling has a warmup period during which instances don't send metrics to CloudWatch (source), which sounds great, but there seems to be no such functionality in Application Auto Scaling. Is there a way to mitigate the spikes / excessive scaling out using an out-of-the-box AWS solution?
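One way around the "messages don't map cleanly to CPU" problem, described in the AWS docs for queue-driven scaling, is to track *backlog per worker* instead of raw queue depth, and publish it as a custom CloudWatch metric. A minimal sketch, with illustrative names and targets; the pure computation is separated from the AWS calls so it can be tuned independently:

```python
# "Backlog per worker" sketch: scale on queued messages per running task
# rather than CPU. The target (messages per worker) and the 3/32 bounds
# mirror the question's limits; all other names are hypothetical.
import math

def backlog_per_worker(visible_messages: int, running_workers: int) -> float:
    """Average number of queued SQS messages per running task."""
    return visible_messages / max(running_workers, 1)

def desired_workers(visible_messages: int, msgs_per_worker_target: int) -> int:
    """Worker count that keeps the backlog at target, clamped to [3, 32]."""
    needed = math.ceil(visible_messages / msgs_per_worker_target)
    return min(max(needed, 3), 32)

# Publishing would use boto3 (shown as comments to keep the sketch runnable
# without AWS credentials):
# cloudwatch.put_metric_data(
#     Namespace="MyService",                      # hypothetical namespace
#     MetricData=[{"MetricName": "BacklogPerWorker",
#                  "Value": backlog_per_worker(visible, workers)}])
```

A target tracking policy on `BacklogPerWorker` (via a customized metric specification) is self-correcting in a way a CPU target is not: a freshly started, cold task still counts in the denominator, so its warmup CPU spike doesn't inflate the tracked metric.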


Upvotes: 0

Views: 124

Answers (0)
