Reputation: 8988
Last night one of the websites (.NET 4.0 Web Forms) hosted on my Windows Server 2008 R2 (IIS 7.5) machine started to time out, throwing the following error for all connected users.
TYPE System.Web.HttpException
MESSAGE Request timed out.
DETAIL System.Web.HttpException (0x80004005): Request timed out.
The outage was confined to just one website within IIS, the others continued to work fine.
Unfortunately I was unable to identify why the website was timing out. Here are the steps I took:
First thing I did was look at the task manager which revealed normal CPU and memory usage. Network activity was also moderate.
I then opened IIS to look at the live connections under 'Worker Processes'. There were about 60 live connections, so it didn't look like anything DDoS related.
Checked database connectivity (hosted on a separate server), all fine!
I then restarted the website in IIS. That didn't work.
I tried to then do a complete iisreset
...still no luck :(
In the end (and under some duress) the only thing I could think to do to resolve this was to restart the server.
Restarting the server worked, but I am nervous about not knowing why this happened in the first place. Can anyone recommend any checks that I failed to carry out? Is there an official checklist for working through these sorts of IIS problems? I have reviewed the IIS logs but don't see anything unusual in the run-up to the outage.
Any pointers or links to useful resources to help me understand and mitigate this in future will be much appreciated.
EDIT
The only time I logged into the server that day was to add an additional web handler component (for remote deploy) to IIS Web Deploy. I doubt this caused the outage, as the server worked for 6 hours afterwards.
Upvotes: 2
Views: 1269
Reputation: 16878
Because iisreset didn't help and you had to restart the whole machine, I suspect a system-wide resource shortage, with the most heavily used (or most resource-hungry) website being the one affected. Possible causes include a lack of available RAM or network connection congestion caused by malfunctioning calls (for example, many CLOSE_WAIT sockets exhausting the connection pool; we have seen that in production when an external service malfunctioned). It could also have been a problem with one specific client that was disconnected by the machine restart, which would explain why the problem went away.
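To check the CLOSE_WAIT hypothesis next time, one option is to tally socket states per process from the text that netstat -ano prints. The following is a minimal Python sketch, not a definitive tool: the function names are made up for illustration, and it assumes the usual five-column TCP rows (protocol, local address, foreign address, state, PID).

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output):
    """Tally (state, pid) pairs from the text printed by `netstat -ano`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: TCP rows have five columns (proto, local, foreign,
        # state, PID). UDP rows carry no state column and are skipped.
        if len(fields) == 5 and fields[0] == "TCP":
            state, pid = fields[3], fields[4]
            counts[(state, pid)] += 1
    return counts


def close_wait_by_pid(netstat_output):
    """Return {pid: number of sockets in CLOSE_WAIT} for the given output."""
    return {
        pid: n
        for (state, pid), n in count_tcp_states(netstat_output).items()
        if state == "CLOSE_WAIT"
    }


def read_netstat():
    """Run `netstat -ano` on the local machine and return its text output."""
    result = subprocess.run(
        ["netstat", "-ano"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling close_wait_by_pid(read_netstat()) on the server gives a per-PID count; a w3wp PID with a large and growing number would point towards the connection-exhaustion scenario described above.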
I would start with:
Historical analysis
Monitoring, using these performance counters:
\Processor(_Total)\% Processor Time
\.NET CLR Exceptions(_Global_)\# of Exceps Thrown / sec
\Memory\Available MBytes
\Web Service(Default Web Site)\Current Connections (one per site name)
\ASP.NET v4.0.30319\Request Wait Time
\ASP.NET v4.0.30319\Requests Current
\ASP.NET v4.0.30319\Requests Queued
\Process(XXX)\Working Set (XXX for each w3wp process)
\Process(XXX)\% Processor Time (XXX for each w3wp process)
\Network Interface(XXX)\Bytes Total/sec
netstat -ano to analyze network traffic (or, even better, the TCPView tool)
If none of this leads you to a conclusion, create a Debug Diagnostic rule that captures a memory dump of the process on long-running requests, and analyze it with WinDbg and the PSSCor extension for .NET debugging.
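The counters listed above can also be logged ahead of the next incident, so there is history to look at afterwards. Below is a rough Python sketch that only assembles a command line for the Windows typeperf utility. The function name, the output file name, the default interval and sample count, and the "w3wp" and "Default Web Site" instance names are all assumptions to replace with your own values.

```python
import subprocess

# Counter paths based on the list above. Instance names such as "w3wp" and
# "Default Web Site" are placeholders and must match the actual server.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
    r"\Web Service(Default Web Site)\Current Connections",
    r"\ASP.NET v4.0.30319\Request Wait Time",
    r"\ASP.NET v4.0.30319\Requests Current",
    r"\ASP.NET v4.0.30319\Requests Queued",
    r"\Process(w3wp)\Working Set",
]


def build_typeperf_command(counters, interval_seconds=15, samples=240,
                           output_path="iis_counters.csv"):
    """Build an argument list that samples the counters into a CSV file."""
    return [
        "typeperf",
        *counters,
        "-si", str(interval_seconds),  # seconds between samples
        "-sc", str(samples),           # number of samples to collect
        "-f", "CSV",                   # output file format
        "-o", output_path,             # output file
    ]


def collect(counters=COUNTERS):
    """Run typeperf on the local Windows machine (blocks until it finishes)."""
    subprocess.run(build_typeperf_command(counters), check=True)
```

With the defaults, collect() would sample every 15 seconds for an hour. For continuous logging, a Performance Monitor data collector set is probably the better fit; this sketch is only meant for an ad hoc capture.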
Upvotes: 2