simona

Reputation: 2181

slurm: produce job stats in log files

I use Slurm to run jobs on a cluster. I would like to get stats about a job, such as used memory, number of processors and wall time, written into the job's log file. I think this was possible with LSF (if I remember correctly and am not confusing it with some other platform).

Upvotes: 2

Views: 1614

Answers (1)

Colas

Reputation: 2076

You can get this information from the Slurm accounting database; see https://slurm.schedmd.com/sacct.html or the question "Find out the CPU time and memory usage of a slurm job". For example: sacct --jobs=12345 --format=NCPUS,MaxRSS,CPUTime.
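If you want those stats to end up in the job's own log file, one option is to call sacct at the end of the batch script itself. A minimal sketch, where my_program is a placeholder for your actual workload (accounting data for the final step may still be settling when the query runs, so values can differ slightly from the final record):

#!/bin/sh
#SBATCH --job-name=stats-demo
#SBATCH --output=stats-demo-%j.log

# Run the actual workload ("my_program" is a placeholder).
srun my_program

# Query the accounting database for this job; the output goes
# to stdout and so lands in the job's log file.
sacct --jobs="${SLURM_JOB_ID}" --format=JobID,NCPUS,MaxRSS,CPUTime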

Note: you can also have this appended automatically by an epilog script. Here is an example epilog.srun:

#!/bin/sh
TMPDIR="/local"

# Extract the path of the job's stdout file from scontrol output
stdoutfname=$(scontrol show job "${SLURM_JOB_ID}" --details | grep "StdOut=" | sed -n 's/.*StdOut=\([^ ]*\).*/\1/p')

# Append job usage info to the job's stdout file
if [ -w "${stdoutfname}" ] && [ "${TMPDIR}" != "" ]; then
  sacct --format JobID,JobName,AveCPUFreq,AveDiskRead,AveRSS,CPUTime,MaxDiskWrite -j "${SLURM_JOB_ID}" >> "${stdoutfname}"
fi
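For the epilog to run automatically, a cluster administrator has to register it in slurm.conf. A minimal sketch, assuming the script is installed at /etc/slurm/epilog.srun (the path, and which epilog hook a site uses, are assumptions here):

# slurm.conf (cluster-wide configuration, admin-only)
# Run this script on the compute node after each job completes.
Epilog=/etc/slurm/epilog.srun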

Alternatively, you can run /usr/bin/time -v <your command> inside your script (the full path matters, since time is otherwise a shell builtin; see https://stackoverflow.com/a/774601/6352677). Its report will appear in the logs, but the numbers will not exactly match Slurm's accounting values.
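A minimal sketch of that approach in a batch script, again with my_program as a placeholder for your actual command:

#!/bin/sh
#SBATCH --output=job-%j.log

# GNU time writes its resource report (max RSS, wall time, etc.)
# to stderr, which Slurm merges into the job log by default.
/usr/bin/time -v my_program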

Upvotes: 2
