Alberto Chiusole

Reputation: 2784

Extract details for past jobs in SLURM

In PBS, one can query a specific job with qstat -f and obtain (all?) info and details to reproduce the job:

# qstat -f 1234
Job Id: 1234.login
    Job_Name = job_name_here
    Job_Owner = user@pbsmaster
    ...
    Resource_List.select = 1:ncpus=24:mpiprocs=24
    Resource_List.walltime = 23:59:59
    ...
    Variable_List = PBS_O_HOME=/home/user,PBS_O_LANG=en_US.UTF-8,
    PBS_O_LOGNAME=user,...
    etime = Mon Apr 20 16:38:27 2020
    Submit_arguments = run_script_here --with-these flags

How may I extract the same information from SLURM?
scontrol show job %j only works for currently running jobs or those terminated up to 5 minutes ago.

Edit: I'm currently using the following to obtain some information, but it's not as complete as a qstat -f:

sacct -u $USER \
      -S 2020-05-13 \
      -E 2020-05-15 \
      --format "Account,JobID%15,JobName%20,State,ExitCode,Submit,CPUTime,MaxRSS,ReqMem,MaxVMSize,AllocCPUs,ReqTres%25"

I usually pipe this into | (head -n 2; grep -v COMPLETED) | sort -k12 to inspect only the runs that did not complete.
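A sturdier variant of that filter, as a minimal sketch: when reading from a pipe, head may consume more than the two header lines, so the grep after it can end up with nothing left to read. The helper below is hypothetical (the name non_completed is not part of SLURM) and assumes sacct is run with --parsable2, which prints a single header line and separates columns with "|".

```shell
# Hypothetical helper: reads `sacct --parsable2` output on stdin,
# keeps the header line and every row whose State column is not COMPLETED.
non_completed() {
    awk -F'|' '
        # First line is the header: remember which column is "State".
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "State") col = i
            print
            next
        }
        # Data rows: print only those that did not complete.
        col && $col != "COMPLETED"
    '
}

# Assumed usage (needs a cluster with SLURM accounting enabled):
#   sacct -u "$USER" -S 2020-05-13 -E 2020-05-15 --parsable2 \
#         --format=JobID,JobName,State,ExitCode,Submit | non_completed
```

Locating the State column by name rather than by position means the --format list can be reordered without touching the filter.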

Upvotes: 3

Views: 6652

Answers (1)

Maarten-vd-Sande

Reputation: 3701

You can list the jobs recorded from a given date onward like so (--starttime selects jobs after that time, not before it):

sacct --starttime 2020-01-01

Then pick the job you are interested in (e.g. job 1234) and print its details with sacct:

sacct -j 1234 --format=User,JobID,Jobname,partition,state,time,start,end,elapsed,MaxRss,MaxVMSize,nnodes,ncpus,nodelist

See the sacct documentation under --helpformat, or run sacct --helpformat, for a complete list of available fields.
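To get something that reads more like qstat -f, the sacct row can be turned into one "Field = value" line per column. This is only a sketch: as_records is a hypothetical helper name, and it assumes the input comes from sacct --parsable2 (one header line, "|" as separator).

```shell
# Hypothetical helper: reads `sacct --parsable2` output on stdin and
# prints every row as a block of "Field = value" lines.
as_records() {
    awk -F'|' '
        # Header line: remember the field names by column index.
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
        # Data rows: one "Field = value" line per column, then a blank line.
        {
            for (i = 1; i <= NF; i++) printf "%s = %s\n", name[i], $i
            print ""
        }
    '
}

# Assumed usage (needs a cluster with SLURM accounting enabled):
#   sacct -j 1234 --parsable2 --long | as_records
```

Each job step (for example the batch step) is a separate row in sacct, so a single job ID usually produces several blocks.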

Upvotes: 5
