
amazon-web-services amazon-ec2 amazon-ecs autoscaling

Reputation: 21

AWS ECS Scaling based on memoryreservation

I've been given an AWS environment to look after. It runs ECS on EC2 instances and has scaling configured using ECS Memory Reservation. The system was set up before Cluster Autoscaling became generally available, so it just uses a CloudWatch metric to scale out and scale in. As far as I can work out, it follows a basic AWS design.

The EC2 instances are in an Auto Scaling group that can scale from 1 to 5 instances, with 1 being the desired capacity.
There is 1 cluster service running with 6 tasks configured.
5 of those tasks are configured with a desired count of 1 and a maximum of 2 copies; the other is set to a maximum of 1.
The tasks have MemoryReservation (soft limit) figures configured but not Memory (hard limit).
The tasks are primarily running Java.
The highest memory reservation is set at about 200MB and most are around this figure.
The scale-out rule is based on MemoryReservation at 85%.
Docker stats shows most of the tasks using about 300MB, and some exceed 600MB.
The instance has 4GB of RAM.

If the maximum reservation is 2GB, even if the tasks are consuming more like 3GB in reality, am I right in believing that the scale out rule will NEVER be invoked because 2GB is 50% of available RAM? Do I need to increase the memory reservations to something more realistic?

Also, if it is only running a single EC2 instance, am I right in thinking that even if I increased the MemoryReservation figures to something more realistic, it won't spin up a second EC2 instance automatically just because there's no theoretical room to start another task? I picked this up from different articles I've been reading while searching.

Thanks

Upvotes: 1

Answers (2)

Efren

Reputation: 4917

Even after the May 2022 update to Capacity Providers, they still have a gap to fill when it comes to memory-based scaling.

The "ECS Memory Reservation" option the OP mentions no longer even seems to be available (at least in the web console).

And when creating the Capacity Provider, only the target value is configurable.

There are more details on how this capacity is calculated in this blog, which mentions:

This calculation accounts for vCPU, memory, ENI, ports, and GPUs of the tasks and the instances

However, consider tasks whose memory consumption does not necessarily grow, in a service with scheduled actions configured to scale the number of tasks (e.g. a different minimum number of tasks at different times of day).

This case will not trigger a scale-out: if a task simply does not fit on the running instances because of its configured memory, it is never placed, and you will see errors in the service events like:

service myservice was unable to place a task because no container instance met all of its requirements. The closest matching container-instance abc123xxxx has insufficient memory available.

This basically means a scheduled task-scaling change may not happen if the task's memory setting is just large enough that it doesn't fit on the running instances. The CapacityProviderReservation metric does not change, because the calculation is only done when tasks are in the provisioning state, which does not happen in this case.
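To illustrate the placement failure above, here is a minimal sketch of a "does it fit" check based on reserved memory. It is only a simplified model, not the actual ECS scheduler: the `ContainerInstance` class, `can_place` function, and all the numbers are hypothetical, and real placement also considers CPU, ports, ENIs and so on.

```python
from dataclasses import dataclass


@dataclass
class ContainerInstance:
    registered_mib: int  # memory the instance registered with the cluster (assumed value)
    reserved_mib: int    # sum of the reservations of tasks already placed on it

    def remaining_mib(self) -> int:
        # Memory still available for new reservations, regardless of real usage.
        return self.registered_mib - self.reserved_mib


def can_place(instances, task_reservation_mib):
    """Return True if any instance has enough unreserved memory for the task."""
    return any(i.remaining_mib() >= task_reservation_mib for i in instances)


# Hypothetical cluster: one instance, 3800 MiB registered, 3000 MiB already reserved.
cluster = [ContainerInstance(registered_mib=3800, reserved_mib=3000)]

print(can_place(cluster, 512))   # 800 MiB remain, so a 512 MiB task fits
print(can_place(cluster, 1024))  # a 1024 MiB task does not fit on any instance
```

In the second case the task is rejected with the "insufficient memory available" event rather than waiting for capacity, which is the situation described above.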

Possible workarounds

Decrease the Capacity Reservation target. By default the target is 100 (%), so the provider tries to use the ASG's cluster resources as fully as possible. A value below 100 makes it scale out once the cluster is used at that level, keeping a spare margin of resources at all times. New scheduled tasks will then fit, as long as the spare margin is large enough (e.g. calculate it from the per-task memory reservation and the cluster memory reservation of all tasks expected to run).
Set up ASG scaling rules that match the service scaling rules. While possible, this is prone to timing problems and to conflicts with auto scaling caused by other triggers.
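As a rough sketch of the first workaround: the blog describes CapacityProviderReservation as the number of instances needed divided by the number running, times 100, and the ASG is scaled to keep that metric at the target. Assuming that formula, the effect of a target below 100 can be approximated as follows. The function names are made up for illustration, and real target tracking has step limits and cooldowns that are ignored here.

```python
def capacity_provider_reservation(instances_needed: int, instances_running: int) -> float:
    """Metric value as described in the blog: needed / running * 100."""
    return 100.0 * instances_needed / instances_running


def instances_for_target(instances_needed: int, target_capacity: int) -> int:
    """Smallest instance count that keeps the metric at or below the target."""
    if not 1 <= target_capacity <= 100:
        raise ValueError("target_capacity must be between 1 and 100")
    # Integer ceiling of instances_needed * 100 / target_capacity.
    return (instances_needed * 100 + target_capacity - 1) // target_capacity


# Hypothetical example: the running tasks need 4 instances.
for target in (100, 80):
    n = instances_for_target(4, target)
    print(target, n, capacity_provider_reservation(4, n))
```

With a target of 100 the group settles at exactly the 4 instances needed, leaving no headroom. With a target of 80 it settles at 5, so one instance's worth of capacity stays free for tasks added by a scheduled action.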

Upvotes: 1

Shahad

Reputation: 781

A few things:

1. Cluster AutoScaling is usually just the term ECS uses for "an AutoScaling group that launches instances into the cluster", and it sounds like that's what you are currently using. Capacity Providers are a newer feature where ECS manages the ASG more directly, which might be the newer feature you're thinking of.
2. 'Desired Capacity' isn't a state that you set for where you want the group to be; it's the current amount of capacity that AutoScaling wants there to be in the group. So if a scaling policy goes off and says +1, the desired capacity will change to 2, and AutoScaling will then try to launch an instance, since you presumably only had 1 before (because the desired capacity was 1 before).
3. Memory reservation is based on the 2GB reserved, so how much is actually in use doesn't matter for scaling purposes. This is important because even if you had 6 of 8GB reserved (from three 2GB tasks) but 7.5GB in use, ECS would still allow another task to be launched, since there are still 2GB reservable.
4. Because of 3), you should probably increase the reservation value; you wouldn't want an instance to get overloaded. Java can be nasty about RAM issues. This would also help with your scale-out threshold issue.
5. For your second question, scaling will only happen after the CloudWatch alarm is triggered. So if the metric never goes above that threshold, the alarm can't trigger the scaling policy. There is a whole host of cases where the alarm triggers and scaling still doesn't happen (more of them for scaling in than for scaling out, but it can happen on scale-out too); but the alarm going into the Alarm state is definitely a required step.
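To put numbers on the threshold issue, here is a small sketch of the cluster MemoryReservation percentage, which is the memory reserved by running tasks divided by the memory registered by the container instances. The figures are assumptions based on the question: 11 tasks at most, about 200MB each, and one instance treated as registering the full 4096 MiB (in practice it registers somewhat less).

```python
def memory_reservation_pct(task_reservations_mib, registered_mib_per_instance, instance_count):
    """Cluster MemoryReservation: reserved memory / registered memory * 100."""
    reserved = sum(task_reservations_mib)
    registered = registered_mib_per_instance * instance_count
    return 100.0 * reserved / registered


THRESHOLD = 85.0  # scale-out alarm threshold from the question

# Worst case today: 11 tasks at roughly 200 MiB reserved each on one 4 GiB instance.
current = memory_reservation_pct([200] * 11, 4096, 1)
print(round(current, 1), current >= THRESHOLD)

# Hypothetical fix: 6 tasks reserving 600 MiB each, closer to what docker stats shows.
realistic = memory_reservation_pct([600] * 6, 4096, 1)
print(round(realistic, 1), realistic >= THRESHOLD)
```

With the current reservations the metric tops out in the low 50s, so an 85% alarm can never fire no matter how much memory the containers really use. With reservations closer to actual usage, the same instance crosses the threshold at its normal task count.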

Upvotes: 0
