Blue Whale

Reputation: 21

PBS: job on two nodes uses memory of only one

I am trying to run a job (Python code) on a cluster using MPI. There is 63 GB of memory available on each node. When I run it on one node, I specify the PBS parameters as follows (only the relevant parameters are listed here):

#PBS -l mem=60GB
#PBS -l nodes=node01.cluster:ppn=32
time mpiexec -n 32 python code.py

That works just fine.

Since the PBS man page says mem is the memory for the entire job, my parameters when trying to run it on two nodes are:

#PBS -l mem=120GB
#PBS -l nodes=node01.cluster:ppn=32+node02.cluster:ppn=32
time mpiexec -n 64 python code.py

This doesn't work (qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max mem requirement). It fails even if I set, for example, mem=70GB (in case the system needs some extra memory). If I set mem=60GB when trying to use both nodes, I get

=>> PBS: job killed: mem job total xx kb exceeded limit yy kb.

I tried it with pmem as well (that is, pmem=1875MB), but with no success.

My question is: how can I use the entire 120 GB of memory?

Upvotes: 2

Views: 1881

Answers (1)

Hristo Iliev

Reputation: 74395

Torque / PBS ignores the mem resource unless the job uses a single node (see here):

Maximum amount of physical memory used by the job. (Ignored on Darwin, Digital Unix, Free BSD, HPUX 11, IRIX, NetBSD, and SunOS. Also ignored on Linux if number of nodes is not 1. Not implemented on AIX and HPUX 10.)

You should instead use the pmem resource, which limits the memory per job process. With ppn=32, setting pmem to 1920MB gives you 60 GB per node. Keep in mind that pmem does not allow memory to be distributed flexibly between the processes on a node the way mem does: mem is accounted as an aggregate value over the whole job, while pmem applies to each process individually.
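As a minimal sketch, the two-node submission from the question might look like this with pmem (the node names and the 1920MB value assume ppn=32 and 60 GB of usable memory per node, as above):

#PBS -l nodes=node01.cluster:ppn=32+node02.cluster:ppn=32
#PBS -l pmem=1920MB

# 32 processes per node x 1920 MB each = 60 GB per node,
# enforced on each process rather than on the job as a whole
time mpiexec -n 64 python code.py

One consequence of the per-process accounting: a single rank that temporarily needs more than 1920 MB will be killed even if the node as a whole is far below 60 GB.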

Upvotes: 2
