Reputation: 31
In AWS Batch, when I specify a memory requirement of e.g. 32000 MB, my job ends up getting killed because (a) the instance that gets auto-selected has 64 GB of memory and (b) ECS seems to treat 32000 MB as both a requirement and a hard limit ("If your container attempts to exceed the memory specified here, the container is killed" from https://docs.aws.amazon.com/batch/latest/userguide/job_definition_parameters.html). So as soon as my job goes slightly above 32 GB, it gets killed, even though I am happy for it to use up to the full 64 GB.
How do I properly specify a minimum memory requirement without causing AWS Batch to kill jobs that go slightly above that? It seems very strange to me that the "memory" parameter appears to be both a minimum and a maximum.
I assume I'm misunderstanding something.
Upvotes: 3
Views: 6535
Reputation: 5796
The memory requirement in the resourceRequirements property is always the maximum/upper bound: you specify there how much memory, at most, your job container is going to use.
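For illustration, here is a minimal sketch with boto3 of where that value is set in a job definition; the job name, image and vCPU count are made-up placeholders, and the MEMORY entry (in MiB) is the value the documentation below is talking about:

```python
import boto3

batch = boto3.client("batch")

# Hypothetical job definition -- name, image and resource values are
# placeholders, not taken from the question.
batch.register_job_definition(
    jobDefinitionName="my-job",
    type="container",
    containerProperties={
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
        "resourceRequirements": [
            {"type": "VCPU", "value": "4"},
            # Hard upper bound in MiB: the container is killed if it
            # attempts to exceed this amount.
            {"type": "MEMORY", "value": "32000"},
        ],
    },
)
```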
Quote from https://docs.aws.amazon.com/batch/latest/userguide/job_definition_parameters.html :
The hard limit (in MiB) of memory to present to the container. If your container attempts to exceed the memory specified here, the container is killed.
A lower/minimum bound would not make much sense: AWS has to place your job container on a host that can actually accommodate the upper bound/limit, because there is no way to tell a priori how much memory your container is really going to use.
Or put another way: if there were such a thing as a "minimum" requirement and you specified minimum = 1 MiB and maximum = 16 GiB, what is AWS Batch supposed to do with this information? It cannot put your job container onto a host with, say, 512 MiB of memory, because your job container may exceed that as it runs, since you said the maximum was 16 GiB (in this example). And AWS Batch is not going to freeze a running job and migrate it onto another host once the current host's memory is exhausted.
The fact that AWS Batch decided to put your particular job container onto an instance with 64 GiB may be coincidental, because 32 GiB sits right at the boundary between common instance memory sizes (32 GiB <-> 64 GiB). And if your job were to use the full 32 GiB, there wouldn't be any memory left for the host itself (without swapping).
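Put concretely, if the job should be allowed to grow toward the 64 GiB of such an instance, the way to express that is to declare the larger maximum up front. A minimal sketch with boto3 (job, queue and definition names are assumptions), leaving some headroom for the ECS agent and host OS rather than claiming the full 64 GiB:

```python
import boto3

batch = boto3.client("batch")

# Hypothetical submission -- names are placeholders. Overriding MEMORY at
# submit time raises the hard limit for this run only; ~61000 MiB leaves
# headroom for the ECS agent and OS on a 64 GiB instance.
batch.submit_job(
    jobName="my-job-run",
    jobQueue="my-queue",
    jobDefinition="my-job",
    containerOverrides={
        "resourceRequirements": [
            {"type": "MEMORY", "value": "61000"},
        ]
    },
)
```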
Upvotes: 2