Reputation: 151
I have a simple script, test.ksh, that I submit with the command:
sbatch test.ksh
I keep getting "JobState=FAILED Reason=NonZeroExitCode" (as reported by "scontrol show job").
I have already made sure of the following:
Upvotes: 6
Views: 8601
Reputation: 11
Sometimes the issue is due to missing folders.
You can find the job's output file locations by running scontrol show job <jobid>
and checking the StdOut and StdErr fields.
In my case the slurm folder was missing.
Resolve it by creating the missing folder(s).
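The check described above can be sketched as a small shell function. This is only an illustration, not part of Slurm: the function name missing_log_dirs is made up here, and it assumes the scontrol output lists the paths as StdOut=... and StdErr=... lines.

```shell
# Hypothetical helper: reads `scontrol show job <jobid>` output on stdin and
# prints the parent directory of every StdOut/StdErr path that does not exist.
missing_log_dirs() {
    # Keep only the StdOut=/StdErr= lines, then strip everything up to the first "="
    grep -E '^[[:space:]]*Std(Out|Err)=' | cut -d= -f2- |
    while IFS= read -r path; do
        dir=$(dirname -- "$path")
        # Report the directory only if it is missing
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done | sort -u
}

# Usage (replace <jobid> with the real job ID):
#   scontrol show job <jobid> | missing_log_dirs
# Then create each directory it prints with: mkdir -p <directory>
```

If the function prints nothing, the output directories exist and the failure probably has another cause, such as permissions.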
Upvotes: 0
Reputation: 347
In my case, the folder was owned by root while I was submitting jobs as a different user. I had mistakenly created the folder as root inside that user's home directory. Running chown user:usergroup foldername
fixed the problem.
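To confirm this kind of problem before changing ownership, something like the following could be run as the user who submits the job. It is a rough sketch; the function name check_log_dir is invented for this example, and foldername stands for whatever directory the job writes to.

```shell
# Hypothetical helper: reports whether the current user can write to a directory.
check_log_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"        # the directory does not exist at all
    elif [ ! -w "$dir" ]; then
        echo "not-writable"   # exists, but the current user cannot write to it
    else
        echo "ok"
    fi
}

# Usage:
#   check_log_dir foldername
#   ls -ld foldername                          # shows the current owner and group
#   sudo chown user:usergroup foldername       # fix ownership if it says root
```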
Upvotes: 0
Reputation: 151
I found out that I hadn't set --error and --output, so the output files defaulted to the directory from which I was submitting the job.
The problem was that I didn't have sufficient privileges to write to that directory.
The solution was to point --error and --output at a location where I had write privileges.
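As a minimal sketch of that fix, the snippet below builds the two options for a log directory that is known to be writable. The function name sbatch_log_args and the directory $HOME/slurm-logs are assumptions made for the example; %j is sbatch's filename pattern for the job ID.

```shell
# Hypothetical helper: makes sure the log directory exists and is writable,
# then prints the --output/--error options pointing into it.
sbatch_log_args() {
    logdir=$1
    # Fail early if the directory cannot be created or written to
    mkdir -p "$logdir" && [ -w "$logdir" ] || return 1
    # %j is expanded by sbatch to the job ID
    printf '%s\n' "--output=$logdir/%j.out" "--error=$logdir/%j.err"
}

# Usage (assumes the path contains no spaces, since the result is word-split):
#   sbatch $(sbatch_log_args "$HOME/slurm-logs") test.ksh
```

The same options can also be placed in the script itself as #SBATCH --output=... and #SBATCH --error=... lines.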
Upvotes: 7