Reputation: 64820
How do you diagnose problems with PostgreSQL performance?
I have a Django-based webapp using PostgreSQL as a database backend on Ubuntu 12, and under heavy load, the database seems to just disappear, causing the Django-interface to be unreachable and resulting in errors like:
django.db.utils.DatabaseError: error with no message from the libpq
django.db.utils.DatabaseError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
What's odd is that the logs in /var/log/postgresql show nothing unusual. The only thing the log /var/log/postgresql/postgresql-9.1-main.log shows are lots of lines like:
2012-09-01 12:24:01 EDT LOG: unexpected EOF on client connection
Running top
shows that PostgreSQL doesn't seem to be consuming any CPU, even though service postgresql status
indicates it's still running.
Doing a 'service postgresql restart` temporarily fixes the problem, but the problem returns as soon as there's a lot of load on the database.
I've checked the dmesg and syslog, but I don't see anything that would explain what's wrong. What other logs should I check? How do I determine what's wrong with my PostgreSQL server?
Edit: My max_connections is set to 100. Although I am doing a lot of manual transactions. Reading up on Django's ORM behavior with PostgreSQL in manual mode, it looks like I may have to explicitly do connection.close(), which I'm not doing.
Upvotes: 2
Views: 5138
Reputation: 64820
I found this was due to Django's buggy Postgres-backend in combination with multi-processing. Essentially, Django doesn't properly close it's connections automatically, causing some weird behavior like tons of "idle in transaction" connections. I fixed it by adding connection.close()
to the end of my multi-processing launched functions and before certain queries that were throwing this error.
Upvotes: 4
Reputation: 95722
The program pg_ctl has some options that might help. (man pg_ctl
)
-c
Attempt to allow server crashes to produce core files, on platforms
where this is possible, by lifting any soft resource limit placed
on core files. This is useful in debugging or diagnosing problems
by allowing a stack trace to be obtained from a failed server
process.
-l filename
Append the server log output to filename. If the file does not
exist, it is created. The umask is set to 077, so access to the log
file is disallowed to other users by default.
The program postgres also has some debug options. (man postgres
)
-d debug-level
Sets the debug level. The higher this value is set, the more
debugging output is written to the server log. Values are from 1 to
5. It is also possible to pass -d 0 for a specific session, which
will prevent the server log level of the parent postgres process
from being propagated to this session.
In the section "Semi-internal Options" . . .
-n
This option is for debugging problems that cause a server process
to die abnormally. The ordinary strategy in this situation is to
notify all other server processes that they must terminate and then
reinitialize the shared memory and semaphores. This is because an
errant server process could have corrupted some shared state before
terminating. This option specifies that postgres will not
reinitialize shared data structures. A knowledgeable system
programmer can then use a debugger to examine shared memory and
semaphore state.
-T
This option is for debugging problems that cause a server process
to die abnormally. The ordinary strategy in this situation is to
notify all other server processes that they must terminate and then
reinitialize the shared memory and semaphores. This is because an
errant server process could have corrupted some shared state before
terminating. This option specifies that postgres will stop all
other server processes by sending the signal SIGSTOP, but will not
cause them to terminate. This permits system programmers to collect
core dumps from all server processes by hand.
Upvotes: 1
Reputation: 45910
2012-09-01 12:24:01 EDT LOG: unexpected EOF on client connection
This message shows, so some issue is on client side - maybe some exception from libpq ?? There can be related issues - when clients hangs without correct logout, then you have lot of idle connections, and you get other errors early.
Upvotes: 1