Reputation: 153541
I have a Slurm job like this:
#!/bin/bash
#SBATCH -o %A.%N.out
#SBATCH -e %A.%N.err
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH -n 16
#SBATCH --export=ALL
#SBATCH -t 1:00:00
cmd1 input1 > o1
cmd2 o1 > o2
cmd3 o2 > o3
With sacct, one can get the time and CPU usage for the whole job. I am also interested in getting that information for cmd1 and cmd3 specifically. How can I do that? Will job steps and srun help with that?
Upvotes: 1
Views: 433
Reputation: 5377
You can get a separate entry in sacct per step.
If you run your commands with srun, each one creates a job step that is monitored separately and has its own entry.
The sacct output will then show one line for the whole job, one for the batch step, and one for each step in the script (srun/mpirun commands).
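As a rough sketch of what that could look like for the script in the question (the -n 1 on each srun is an assumption that cmd1 and cmd3 are serial programs; adjust the task and CPU counts if they are not):

```shell
#!/bin/bash
#SBATCH -o %A.%N.out
#SBATCH -e %A.%N.err
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH -n 16
#SBATCH --export=ALL
#SBATCH -t 1:00:00

# Each srun call creates its own job step (<jobid>.0, <jobid>.1, ...).
# -n 1 is an assumption: without it, srun would start one copy of the
# command per allocated task (16 here).
srun -n 1 cmd1 input1 > o1
cmd2 o1 > o2                # no srun: accounted under the batch step
srun -n 1 cmd3 o2 > o3
```

Once the job has run, something like sacct -j <jobid> --format=JobID,JobName,Elapsed,TotalCPU,MaxRSS,State should list the job, the <jobid>.batch step, and one row each for the cmd1 and cmd3 steps.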
Upvotes: 2
Reputation: 18118
You can use time -v to get detailed information about timing and resources used. Note that this refers to the binary /usr/bin/time, not the shell built-in time:
$ /usr/bin/time -v ls /
bin dev home lib64 media opt root sbin sys usr
boot etc lib lost+found mnt proc run srv tmp var
Command being timed: "ls /"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 94%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2136
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 126
Voluntary context switches: 1
Involuntary context switches: 1
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
You can prepend this to any command in your batch script.
Upvotes: 1