Reputation: 9603
We use a lot of custom Windows services in our applications. However, the one I'm currently working on has an infuriating problem: while the service keeps running, it simply stops functioning.
The work inside the service's Main method is wrapped in a try/catch block, like this:
    static void Main()
    {
        IRepository rep = new Repository();
        // typeof(...) instead of GetType(): a static method has no instance to call it on.
        // "Program" stands in for whatever the containing class is actually called.
        ILogger log = LogManager.GetLogger(typeof(Program).Name);
        TimeSpan loadWindowStart = new TimeSpan(9, 0, 0);
        TimeSpan loadWindowEnd = new TimeSpan(18, 0, 0);

        foreach (SuppressionLoad sl in rep.GetSuppressionLoads().ToList())
        {
            try
            {
                // do stuff
            }
            catch (Exception ex)
            {
                // log error
            }
        }
    }
The service also logs as it does stuff, and we can watch the logs fill up while it's busy.
Sometimes, however, the logs just stop. And activity elsewhere in the database suggests the entire service has stopped working. Checking in Services on the server, the service still shows a Status of "Started". It takes up almost zero system resources while it's in this state, although it's normally quite processor intensive. If you try and stop it, it just times out trying and, as far as we can tell, it never stops of its own accord. The process has to be killed in Task Manager.
There is nothing untoward in the log in the run-up to these stalls. There is also nothing we can find in Event Viewer.
Since it doesn't log an error, I'm at a loss as to what's going on here, or what we can do to diagnose the fault. It's highly intermittent - it will often run for several days without a problem before entering this state. What can we do to investigate what's going on?
Upvotes: 0
Views: 145
Reputation: 4860
It sounds like the issue could be anywhere and doesn't necessarily have much to do with the code provided.
Suggestions on how to go about it:
When the service hangs, attach a debugger, take a look at the threads and see where each one is. You may need to rebuild and run a debug version of your solution so that the debugger has the necessary symbol data. Questions to ask:
Considering that it happens infrequently, and the field of what and where is wide open, it'll likely take a few iterations of having the problem trigger in order to narrow down the scope.
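One way to make each occurrence narrow the scope further is to have the service record where it last made progress, so that a stall leaves a trace even though no exception is thrown. What follows is only a sketch under assumptions: `StallWatchdog`, `Checkpoint` and the `log` callback are hypothetical names, not part of the asker's code, and the timer can only report if the process is still able to run thread-pool callbacks.

```csharp
using System;
using System.Threading;

// Hypothetical helper: logs a warning when the worker has not reported
// a checkpoint for longer than the configured limit.
sealed class StallWatchdog : IDisposable
{
    private readonly TimeSpan _limit;
    private readonly Action<string> _log;
    private readonly Timer _timer;
    private long _lastTicks;
    private volatile string _lastStep = "(none)";

    public StallWatchdog(TimeSpan limit, TimeSpan checkEvery, Action<string> log)
    {
        _limit = limit;
        _log = log;
        _lastTicks = DateTime.UtcNow.Ticks;
        // The timer fires on a thread-pool thread, independently of the worker.
        _timer = new Timer(_ => Check(DateTime.UtcNow), null, checkEvery, checkEvery);
    }

    // Called by the worker whenever it reaches a meaningful step.
    public void Checkpoint(string step)
    {
        _lastStep = step;
        Interlocked.Exchange(ref _lastTicks, DateTime.UtcNow.Ticks);
    }

    // Returns true (and logs) when the last checkpoint is older than the limit.
    public bool Check(DateTime nowUtc)
    {
        var last = new DateTime(Interlocked.Read(ref _lastTicks), DateTimeKind.Utc);
        TimeSpan idle = nowUtc - last;
        if (idle < _limit) return false;
        _log("No progress for " + idle + "; last step: " + _lastStep);
        return true;
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```

In the loop from the question, a call such as `watchdog.Checkpoint("load " + sl)` at the top of each iteration, plus a few more around database or network calls, would show which step the service was in when it went quiet. That tells you which thread and call to look at first when you attach the debugger.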
Upvotes: 1
Reputation: 3313
Matt, obscure problems such as these are difficult to find in the best of conditions. If your service happens to use threads (which I assume it does), it becomes tremendously more difficult, and you can't rely on a global try/catch.
A simple thing to try would be NBug (no association). It will catch unhandled exceptions and give you some info about them. I don't think it will get you enough information, though.
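If you would rather not pull in a library, the framework itself exposes hooks for exceptions that escape a try/catch on another thread. This is a minimal sketch, not a replacement for NBug: `CrashLogging`, `Describe` and the `log` callback are hypothetical names, while `AppDomain.CurrentDomain.UnhandledException` and `TaskScheduler.UnobservedTaskException` are the standard .NET events.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: routes exceptions that no try/catch saw to a log callback.
static class CrashLogging
{
    // Formats one log line; 'error' is object because ExceptionObject is typed as object.
    public static string Describe(string source, object error)
    {
        return DateTime.UtcNow.ToString("o") + " [" + source + "] " + error;
    }

    // Call once at the very start of Main, before any threads or tasks are started.
    public static void Install(Action<string> log)
    {
        // Raised for unhandled exceptions on any thread in the AppDomain.
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            log(Describe("AppDomain, terminating=" + e.IsTerminating, e.ExceptionObject));

        // Raised when a faulted Task is garbage collected without its exception being read.
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            log(Describe("Task", e.Exception));
            e.SetObserved();
        };
    }
}
```

Keep in mind that these handlers only fire when an exception actually goes unhandled. A thread that is blocked or deadlocked throws nothing, so for a hang like the one described they may well stay silent.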
The general way to find these sorts of things is log, log, log. You have to be able to come as close to recreating the problem as possible: you need logs that record the entry point into each method, the variable values, the exception stack traces if any are hit, how long you spent in each method, etc. There are some really good logging tools out there, so I won't bother recommending any. You can wrap your logging in a conditional compile switch so that, once you have found your issue, you won't suffer a performance hit when you turn it off.
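As a rough illustration of the conditional compile switch, here is a sketch using the standard `[Conditional]` attribute. The names `MethodTrace`, `Sink` and the `VERBOSE_TRACE` symbol are made up for the example; in practice `Sink` would forward to whichever logging tool you already use.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical tracing helper. Calls to Enter and Exit are removed by the
// compiler unless the calling code is built with VERBOSE_TRACE defined.
static class MethodTrace
{
    // Where trace lines go; assumed to be replaced with a real logger.
    public static Action<string> Sink = Console.WriteLine;

    // Builds one trace line; not conditional, so it can be used anywhere.
    public static string Line(string kind, string method, string detail)
    {
        return kind + " " + method + " " + detail;
    }

    [Conditional("VERBOSE_TRACE")]
    public static void Enter(string detail = "", [CallerMemberName] string method = "")
    {
        Sink(Line("ENTER", method, detail));
    }

    [Conditional("VERBOSE_TRACE")]
    public static void Exit(Stopwatch timer, [CallerMemberName] string method = "")
    {
        Sink(Line("EXIT", method, timer.ElapsedMilliseconds + " ms"));
    }
}
```

A method would then start with `var timer = Stopwatch.StartNew(); MethodTrace.Enter("loadId=" + loadId);` and call `MethodTrace.Exit(timer);` in a finally block. Building without `VERBOSE_TRACE` drops those calls, including the string concatenation in their arguments, so only the Stopwatch itself remains.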
Probably not the answer you wanted, but the only thing that has really worked for me over the years.
SteveJ
Upvotes: 1