Reputation: 2572
I ran a job on a SLURM cluster, and for a while the job was running just fine. The last time I used the queue command squeue, it reported:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2394852 serial_re CombineP user_1 R 22:29 1 bigcluster112
However, I just checked it and it now says:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2394852 serial_re CombineP user_1 PD 0:00 1 (Priority)
and I got an email saying the job has been "PREEMPTED". I searched online and found that when a high-priority job arrives, a low-priority job may be stopped so the high-priority one can run. This is on a shared university cluster, and I didn't run any other jobs myself. Does this mean someone else just submitted a job that now outranks mine? How does one set, or beat, that priority? Thanks!
Upvotes: 3
Views: 5435
Reputation: 59260
Yes: someone submitted a job with a higher priority, with a QOS that has preemption rights over other QOSes, or to a partition that has preemption rights over other partitions.
Look for the word 'Preempt' in the output of scontrol show config, scontrol show partitions, and sacctmgr list qos for more information.
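A minimal sketch of those checks, run on a login node of the cluster (output will of course depend on how your site is configured; grep patterns are just a convenience, not part of SLURM itself):

```shell
# Cluster-wide preemption settings (PreemptType, PreemptMode, ...):
scontrol show config | grep -i preempt

# Per-partition settings: which partitions can preempt which:
scontrol show partitions | grep -i -E 'PartitionName|Preempt|Priority'

# Per-QOS settings: priority and preemption rights of each QOS:
sacctmgr list qos format=Name,Priority,Preempt,PreemptMode
```

If PreemptType is preempt/qos or preempt/partition_prio, that tells you whether it is the QOS or the partition of the other job that gave it the right to preempt yours.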
To see how the priority is computed, have a look at the output of scontrol show config | grep Priority and look up the corresponding keywords in the slurm.conf manpage. The sprio command also shows the per-factor priority breakdown of pending jobs.
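For illustration, a hypothetical slurm.conf excerpt with the multifactor priority plugin (the keywords are real slurm.conf parameters, but the values here are made up, not your cluster's):

```
# Job priority = weighted sum of factors (age, fairshare, QOS, ...)
PriorityType=priority/multifactor
PriorityWeightAge=1000
PriorityWeightFairshare=10000
PriorityWeightQOS=2000
# Preemption driven by QOS; preempted jobs are requeued
PreemptType=preempt/qos
PreemptMode=REQUEUE
```

As a regular user you cannot change these weights; at most you can submit to a higher-priority QOS or partition if your account has access to one.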
Upvotes: 3