I want to see CPU and memory usage for my Slurm jobs, but sacct isn't showing it.
Here is an example of the output for a test job:
root@slurmctld:/# sacct -j 2 -o jobid,maxrss,avecpu,reqtres%30,alloctres%30,elapsed
       JobID     MaxRSS     AveCPU                        ReqTRES                      AllocTRES    Elapsed
------------ ---------- ---------- ------------------------------ ------------------------------ ----------
           2                        billing=1,cpu=1,mem=3M,node=1  billing=1,cpu=1,mem=3M,node=1   00:01:14
     2.batch                                                                 cpu=1,mem=3M,node=1   00:01:14
The accounting setup in my slurm.conf looks like this:
# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=localhost
AccountingStoragePort=6819
AccountingStoragePass=/var/run/munge/munge.socket.2
AccountingStorageUser=slurm
Looking at the underlying MariaDB database that slurmdbd uses, I can see records being written there.
The test job runs a small C program which mallocs blocks of memory and writes random data into them. Within that program I call getrusage() to see what the OS is reporting. I get this:
tv_sec: 63
utime.tv_usec: 645622
blocks in: 32
blocks out: 0
maxrss: 7814184
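For illustration, a minimal sketch of a test program of this kind might look like the following. It is not the original source: the function names fill_blocks and report_usage, and the block count and size passed to them, are hypothetical. It allocates several blocks, writes random bytes into each so the pages are really touched, and then prints the getrusage() fields shown above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: allocate nblocks blocks of block_size bytes and
 * fill each with random bytes so the pages become resident.
 * All blocks are held at the same time, then freed.
 * Returns the number of blocks that were successfully allocated. */
size_t fill_blocks(size_t nblocks, size_t block_size)
{
    unsigned char **blocks = calloc(nblocks, sizeof *blocks);
    if (blocks == NULL)
        return 0;

    size_t done = 0;
    for (; done < nblocks; done++) {
        blocks[done] = malloc(block_size);
        if (blocks[done] == NULL)
            break;
        /* Write through a volatile pointer so the stores are not optimized away. */
        volatile unsigned char *p = blocks[done];
        for (size_t i = 0; i < block_size; i++)
            p[i] = (unsigned char)rand();
    }

    for (size_t i = 0; i < done; i++)
        free(blocks[i]);
    free(blocks);
    return done;
}

/* Hypothetical helper: print what the OS reports for this process.
 * On Linux, ru_maxrss is the peak resident set size in kilobytes.
 * Returns 0 on success, -1 if getrusage() fails. */
int report_usage(FILE *out)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    fprintf(out, "utime.tv_sec: %ld\n", (long)ru.ru_utime.tv_sec);
    fprintf(out, "utime.tv_usec: %ld\n", (long)ru.ru_utime.tv_usec);
    fprintf(out, "blocks in: %ld\n", ru.ru_inblock);
    fprintf(out, "blocks out: %ld\n", ru.ru_oublock);
    fprintf(out, "maxrss: %ld\n", ru.ru_maxrss);
    return 0;
}
```

Calling fill_blocks(64, 1 << 20) followed by report_usage(stdout) from main would print figures in the same format as the output above. The job also has to run longer than the JobAcctGatherFrequency interval (30 seconds here) for the accounting plugin to take at least one periodic sample.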
My cluster is running in ubuntu:22.04 containers under Kubernetes.
According to sinfo -V, the Slurm version is slurm-wlm 21.08.5.
Am I perhaps missing some additional configuration required to get stats like AveCPU and MaxRSS from sacct?