Alan Leavy

Reputation: 21

slurm sacct not returning values for cpu or memory usage (e.g. AveCPU, MaxRSS)

I want to see CPU and memory usage for my Slurm jobs, but sacct isn't showing it.

Here is an example of the output for a test job:

root@slurmctld:/# sacct -j 2 -o jobid,maxrss,avecpu,reqtres%30,alloctres%30,elapsed
JobID            MaxRSS     AveCPU                        ReqTRES                      AllocTRES    Elapsed 
------------ ---------- ---------- ------------------------------ ------------------------------ ---------- 
2                                   billing=1,cpu=1,mem=3M,node=1  billing=1,cpu=1,mem=3M,node=1   00:01:14 
2.batch                                                                      cpu=1,mem=3M,node=1   00:01:14

The accounting setup in my slurm.conf looks like this:

# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=localhost
AccountingStoragePort=6819
AccountingStoragePass=/var/run/munge/munge.socket.2
AccountingStorageUser=slurm

Looking at the underlying MariaDB database that slurmdbd is using, I can see records being written there.

The test job runs a small C program which mallocs blocks of memory and writes random data into them. Within that program I call getrusage() to see what the OS is reporting. I get this:

tv_sec: 63
utime.tv_usec:  645622
blocks in:      32
blocks out:     0
maxrss:         7814184
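
For readers who want to reproduce the test, here is a rough sketch of what the job does. The actual program is written in C and is not shown in this post, so this is a Python approximation built on the standard `resource` module; the function names, block count, and block size are illustrative assumptions, not values from the original program.

```python
import os
import resource


def fill_blocks(n_blocks, block_size):
    """Allocate n_blocks buffers, each holding block_size random bytes."""
    # Writing random data into every buffer makes the pages resident,
    # so the memory counts towards the process's RSS.
    return [os.urandom(block_size) for _ in range(n_blocks)]


def usage_report():
    """Collect the getrusage() fields that the test job prints."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "utime": ru.ru_utime,         # user CPU time in seconds (float)
        "blocks_in": ru.ru_inblock,   # block input operations
        "blocks_out": ru.ru_oublock,  # block output operations
        "maxrss": ru.ru_maxrss,       # peak RSS, in kilobytes on Linux
    }


if __name__ == "__main__":
    # Hypothetical sizes: 64 blocks of 1 MiB each.
    blocks = fill_blocks(n_blocks=64, block_size=1024 * 1024)
    for name, value in usage_report().items():
        print(f"{name}: {value}")
```

On Linux, `ru_maxrss` is reported in kilobytes, so the `maxrss: 7814184` figure above corresponds to roughly 7.5 GiB as seen by the OS.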

My cluster is running in ubuntu:22.04 containers under Kubernetes. According to sinfo -V, the Slurm version is slurm-wlm 21.08.5.

Am I perhaps missing some additional configuration required to get stats like AveCPU and MaxRSS from sacct?

Answers (0)
