Reputation: 11
Let's say my job was running for some time and it went to suspend state due to machine overloading and became running after sometime and got completed. Now the status acquired by this job were RUNNING -> SUSPEND -> RUNNING
How to get all states acquired by a given job ?
Upvotes: 1
Views: 535
Reputation: 932
bjobs -l If the job hasn't been cleaned from the system yet.
bhist -l Otherwise. You might need -n, depending on how old the job is.
Here's an example of bhist -l output when a job was suspended and later resumed because the system load temporarily exceeded the configured threshold.
$ bhist -l 1168
Job <1168>, User <mclosson>, Project <default>, Command <sleep 10000>
Fri Jan 20 15:08:40: Submitted from host <hostA>, to
Queue <normal>, CWD <$HOME>, Specified Hosts <hostA>;
Fri Jan 20 15:08:41: Dispatched 1 Task(s) on Host(s) <hostA>, Allocated 1 Slot(
s) on Host(s) <hostA>, Effective RES_REQ <select[type == any] or
der[r15s:pg] >;
Fri Jan 20 15:08:41: Starting (Pid 30234);
Fri Jan 20 15:08:41: Running with execution home </home/mclosson>, Execution CW
D </home/mclosson>, Execution Pid <30234>;
Fri Jan 20 16:19:22: Suspended: Host load exceeded threshold: 1-minute CPU ru
n queue length (r1m)
Fri Jan 20 16:21:43: Running;
Summary of time in seconds spent in various states by Fri Jan 20 16:22:09
PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
1 0 4267 0 141 0 4409
At 16:19:22 the jobs was suspended because r1m exceeded the threshold. Later at 16:21:43 the job resumes.
Upvotes: 0