jack

Reputation: 17901

Tracking Django/FastCGI Process Errors

I run a Django-based site on Nginx with a FastCGI server. The site generally works great, but every 2-3 days it runs into an unknown problem and stops responding to any requests.

Munin graphs show that IO blocks read and written per second increase by 500% during the problem.

I also wrote a Python script to record the following stats every minute (a sketch of such a recorder is shown after the list):

Load Averages
CPU Usage (user, nice, system, idle, iowait)
RAM Usage
Swap Usage
Number of FastCGI Processes
RAM Used by FastCGI Processes
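
For reference, here is a minimal sketch of that kind of per-minute recorder. It assumes the psutil package and that the FastCGI workers can be matched by "runfcgi" in their command line; both are assumptions on my part, not details from the original setup.

    # Rough per-minute stats recorder (sketch; assumes psutil is installed
    # and FastCGI workers are identifiable by "runfcgi" in their cmdline).
    import os
    import time
    import psutil

    def snapshot():
        load1, load5, load15 = os.getloadavg()
        cpu = psutil.cpu_times_percent(interval=1)
        mem = psutil.virtual_memory()
        swap = psutil.swap_memory()

        # Find the FastCGI worker processes and total their RAM usage.
        fcgi = [p for p in psutil.process_iter(['cmdline', 'memory_percent'])
                if 'runfcgi' in ' '.join(p.info['cmdline'] or [])]
        fcgi_ram = sum(p.info['memory_percent'] for p in fcgi)

        return {
            'load': (load1, load5, load15),
            'cpu': (cpu.user, cpu.nice, cpu.system, cpu.idle, cpu.iowait),
            'ram_percent': mem.percent,
            'swap_percent': swap.percent,
            'fcgi_count': len(fcgi),
            'fcgi_ram_percent': fcgi_ram,
        }

    if __name__ == '__main__':
        while True:
            print(time.strftime('%Y-%m-%d %H:%M:%S'), snapshot())
            time.sleep(60)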

The records show that during the problem the number of FastCGI processes doubles (from the normal 10-15 to 25-30), and the RAM used by FastCGI processes also doubles (from 17% to 35% of total RAM on the server). The increased memory usage forces more swap to be used, which slows down disk IO and makes the server unresponsive.

FastCGI parameters I used:

maxspare=10 minspare=5 maxchildren=25 maxrequests=1000 

I guess the problem is due to poorly written Python code in some part of my site. But I just don't know how to find out which part of the code is freezing the existing FastCGI processes and forcing new instances to be forked.

Upvotes: 0

Views: 283

Answers (1)

Andrew Wilkinson

Reputation: 10846

You've limited the number of children to 25, so once there are 25 processes busy handling requests, any further requests will block and the site will appear unresponsive.

It sounds to me like you have an infinite (or very long) loop that is causing the processes to block. I suggest you add an idle-timeout to the FastCGI script. This will hopefully allow the site to keep running by killing long-running requests, and will let you debug the problem by logging tracebacks from where the processes were killed.
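
If your FastCGI setup doesn't expose an idle-timeout option directly, one way to get those tracebacks (my own sketch, not part of the original answer, and assuming the prefork workers handle each request in their main thread) is old-style Django middleware that arms signal.alarm() per request and dumps the stack when the alarm fires:

    # Watchdog middleware sketch (old-style Django middleware to match the
    # runfcgi era). The 30-second budget is an illustrative assumption;
    # signal.alarm() only works in the process's main thread.
    import logging
    import signal
    import traceback

    logger = logging.getLogger('slow_requests')

    REQUEST_TIMEOUT = 30  # seconds; tune to your slowest legitimate view

    class WatchdogMiddleware(object):
        def process_request(self, request):
            def on_timeout(signum, frame):
                # Log where the worker is stuck, then abort the request.
                logger.error('Request %s exceeded %ss:\n%s',
                             request.path, REQUEST_TIMEOUT,
                             ''.join(traceback.format_stack(frame)))
                raise RuntimeError('request timed out')
            signal.signal(signal.SIGALRM, on_timeout)
            signal.alarm(REQUEST_TIMEOUT)

        def process_response(self, request, response):
            signal.alarm(0)  # cancel the alarm on a normal finish
            return response

Add it to your middleware settings and check the log after the next stall to see which view is hanging.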

Upvotes: 1
