Reputation: 3410
Using sacct, I want to obtain information about my completed jobs.
An existing answer mentions how to obtain a job's information.
I submitted a job named jobName.sh, which got jobID 176. After 12 hours, during which 200 new jobs came in, I tried to check the information for my job (jobID=176) and got the error slurm_load_jobs error: Invalid job id specified.
scontrol show job 176
slurm_load_jobs error: Invalid job id specified
The following command also returns nothing: sacct --name jobName.sh
I assume there is a time limit on how long information about previously submitted jobs is kept, and that my job's information has been removed. Is there such a limit? How can I set it to a very large value to prevent job records from being deleted?
Please note that JobRequeue=0 is set in slurm.conf.
Upvotes: 1
Views: 4352
Reputation: 3410
The Slurm documentation states:
MinJobAge: The minimum age of a completed job before its record is purged from Slurm's active database. Set the values of MaxJobCount and MinJobAge to ensure the slurmctld daemon does not exhaust its memory or other resources. The default value is 300 seconds. A value of zero prevents any job record purging. In order to eliminate some possible race conditions, the minimum non-zero value for MinJobAge recommended is 2.
In my slurm.conf file, MinJobAge was 300, which is 5 minutes. That is why each completed job's information was removed 5 minutes after the job finished. I increased the value of MinJobAge to keep the records around longer.
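As a minimal sketch, the change described above could look like the fragment below. The value 86400 is only an illustrative assumption, not a recommendation: per the quoted documentation, a larger MinJobAge keeps more records in slurmctld's memory, so it should be chosen together with MaxJobCount.

```
# slurm.conf (fragment)
# Illustrative value only: keep completed job records in slurmctld for 24 hours.
MinJobAge=86400
```

slurmctld has to re-read its configuration (for example via scontrol reconfigure or a daemon restart) before the new value applies.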
Upvotes: 2
Reputation: 4571
Assuming that you are using MySQL to store that data, you can tune the purge times, among other settings, in your database configuration file slurmdbd.conf. Here are some examples:
PurgeJobAfter=12hours
PurgeJobAfter=1month
PurgeJobAfter=24months
If not set (default), then job records are never purged.
More information is available in the slurmdbd.conf documentation.
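Once records are kept in the accounting database, they can be queried with sacct. The sketch below assumes accounting storage through slurmdbd is configured. The job ID and name are taken from the question, while the start date is a hypothetical placeholder that should predate the job's submission. According to the sacct man page, a query without -j or an explicit start time only covers jobs since 00:00:00 of the current day, which may be why a name-only query for an older job comes back empty.

```shell
# Job ID and name come from the question; start_date is a hypothetical placeholder.
job_id=176
job_name="jobName.sh"
start_date="2024-01-01"
fields="JobID,JobName,State,Elapsed,Start,End"

# Looking a job up by ID does not restrict the search to the current day.
by_id="sacct -j ${job_id} --format=${fields}"
# Looking a job up by name needs an explicit start time for older jobs.
by_name="sacct --name=${job_name} --starttime=${start_date} --format=${fields}"

if command -v sacct >/dev/null 2>&1; then
  $by_id
  $by_name
else
  # Slurm client tools are not installed here, so only print the commands.
  echo "$by_id"
  echo "$by_name"
fi
```

If both queries still return nothing, the record was probably never written to the accounting database or has already been purged according to the PurgeJobAfter setting.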
Upvotes: 3