Reputation: 3410
My goal is to charge users based on the time (in seconds) for which they were allocated CPUs. What is the best parameter to measure this?
This is how I run the jobs:
Example 1:
sbatch -N1 run.sh
Submitted batch job 20
scontrol update jobid=20 TimeLimit=0-00:01
sacct -o totalcpu,cputime,cputimeraw,Elapsed,SystemCPU,time -j 20
TotalCPU CPUTime CPUTimeRAW Elapsed SystemCPU Timelimit
---------- ---------- ---------- ---------- ---------- ----------
00:00:00 00:11:52 712 00:01:29 00:01:00
00:00:00 00:11:52 712 00:01:29
I set a time limit of 1 minute, but the job seems to have exceeded it by 29 seconds. Is this normal?
Example 2:
sbatch -N1 run.sh
Submitted batch job 21
scontrol update jobid=21 TimeLimit=0-00:02
sacct -o totalcpu,cputime,cputimeraw,Elapsed,SystemCPU,time -j 21
TotalCPU CPUTime CPUTimeRAW Elapsed SystemCPU Timelimit
---------- ---------- ---------- ---------- ---------- ----------
00:00:00 00:18:56 1136 00:02:22 00:02:00
I set a time limit of 2 minutes, but the job seems to have exceeded it by 22 seconds. Is this normal?
How can I convert CPUTimeRAW and CPUTime into real (wall-clock) time in seconds? Based on the examples above, I wasn't able to find the relationship between them.
CPUTimeRAW: units are CPU-seconds.
Upvotes: 0
Views: 1866
Reputation: 1169
The small overrun of the time limit is normal; it is determined by the KillWait parameter in slurm.conf:
The interval, in seconds, given to a job's processes between the SIGTERM and SIGKILL signals upon reaching its time limit. If the job fails to terminate gracefully in the interval specified, it will be forcibly terminated. The default value is 30 seconds.
For charging users:
CPUTime = (Elapsed time) x (the number of CPUs allocated)
so CPUTime (or CPUTimeRAW, the same quantity expressed in seconds) is the CPU time that was allocated to them and what they can be charged for.
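To check this relationship against the numbers in the question, here is a minimal Python sketch. The helper names hms_to_seconds and allocated_cpus are made up for illustration and are not part of Slurm. It assumes the time fields arrive as [DD-]HH:MM:SS strings, as in the sacct output shown above.

```python
def hms_to_seconds(value: str) -> int:
    """Convert a [DD-]HH:MM:SS string (assumed sacct format) to seconds."""
    days = 0
    if "-" in value:
        day_part, value = value.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def allocated_cpus(cputime_raw: int, elapsed: str) -> float:
    """Invert CPUTime = Elapsed x CPUs to recover the CPU count."""
    return cputime_raw / hms_to_seconds(elapsed)


# (CPUTime, CPUTimeRAW, Elapsed) copied from the two sacct outputs in the question.
jobs = {
    20: ("00:11:52", 712, "00:01:29"),
    21: ("00:18:56", 1136, "00:02:22"),
}

for job_id, (cputime, cputime_raw, elapsed) in jobs.items():
    # CPUTime and CPUTimeRAW should be the same quantity in different notation.
    same = hms_to_seconds(cputime) == cputime_raw
    cpus = allocated_cpus(cputime_raw, elapsed)
    print(f"job {job_id}: elapsed={hms_to_seconds(elapsed)}s, "
          f"CPUTimeRAW={cputime_raw}, CPUs={cpus:g}, CPUTime matches RAW: {same}")
```

For both jobs the ratio CPUTimeRAW / Elapsed comes out to 8, which suggests each job held 8 CPUs for its whole run. Dividing CPUTimeRAW by the number of allocated CPUs therefore gives back the wall-clock time in seconds.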
Upvotes: 1