Reputation: 582
I have a Quartz job that runs every minute. One of its activities can fail on every run for an hour or two while a dependent server is down for maintenance. I noticed that when this happens, the job stops running and seems to unschedule itself, without logging any exceptions that I can see. The job still exists: another job of mine verifies that it is registered with the schedule I assigned, yet the job itself no longer executes. I'm assuming there is some threshold that removes a job after it throws an exception x times in a row, but I'm hoping to find a definitive answer.
I'm trying to convince the main developer to catch and log the exception instead of throwing a generic exception and letting it bubble up, but until then, researching the issue is all I can do.
Here's the execution code, essentially. The class itself also has the DisallowConcurrentExecution attribute. When this failure happens, it happens in less than 5 seconds, so I wouldn't expect that attribute to come into play here:
    public void Execute(IJobExecutionContext context)
    {
        _logger.Log("Starting synchronization.");
        try
        {
            syncActivities();
        }
        catch (Exception ex)
        {
            _logger.Log("Error. ", ex);
            throw;
        }
        finally
        {
            _logger.Log("Completed synchronization.");
        }
    }
Upvotes: 2
Views: 2811
Reputation: 582
Once we upgraded to the latest version of Quartz, which provides more comprehensive logging, we saw that we occasionally had errors in the job's constructor, which led Quartz to automatically change the state of our job triggers to ERROR or BLOCKED. We hadn't seen these in our own logs because they were only written to Quartz's internal logs. To account for this, we added trigger-state checks to our job manager and rescheduled any job whose trigger was found in either state.
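For anyone who wants to do something similar, here is a minimal sketch of that kind of check. It is not our actual job manager: the class name `TriggerStateRepair` and the method `RescheduleStuckTriggers` are made up for illustration, and it assumes the synchronous Quartz.NET 2.x scheduler API, which matches the `void Execute` signature in the question. In Quartz.NET 3.x the same scheduler methods return tasks and would need to be awaited.

```csharp
using System;
using Quartz;

// Hypothetical helper, not part of Quartz.NET itself.
public static class TriggerStateRepair
{
    // Re-registers every trigger of the given job that the scheduler
    // reports as Error or Blocked. Returns how many were rescheduled.
    public static int RescheduleStuckTriggers(IScheduler scheduler, JobKey jobKey)
    {
        int repaired = 0;

        foreach (ITrigger trigger in scheduler.GetTriggersOfJob(jobKey))
        {
            TriggerState state = scheduler.GetTriggerState(trigger.Key);

            // Note: Blocked can also be reported while a non-concurrent job
            // is simply still running, so you may want to require that the
            // state persists across several checks before acting on it.
            if (state != TriggerState.Error && state != TriggerState.Blocked)
            {
                continue;
            }

            // Rebuild an equivalent trigger from the existing definition and
            // swap it in under the same key, which replaces the stuck one.
            ITrigger replacement = trigger.GetTriggerBuilder().Build();
            scheduler.RescheduleJob(trigger.Key, replacement);
            repaired++;
        }

        return repaired;
    }
}
```

You would call it from whatever watchdog job you already have, for example `TriggerStateRepair.RescheduleStuckTriggers(scheduler, new JobKey("syncJob"))`, where `"syncJob"` stands in for your own job name.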
Upvotes: 2