John

Reputation: 4261

Getting "idle in transaction" for PostgreSQL with Django

We are using Django 1.3.1 and Postgres 9.1

I have a view which just fires multiple selects to get data from the database.

The Django documentation mentions that when a request completes, a ROLLBACK is issued if only SELECT statements were fired during the view. But I am seeing a lot of "idle in transaction" connections, especially when I have more than 200 requests. I don't see any COMMIT or ROLLBACK statements in the Postgres log.

What could be the problem? How should I handle this issue?

Upvotes: 5

Views: 5146

Answers (2)

Nagyman

Reputation: 599

I understand this is an older question, but this article may describe the problem of idle transactions in django.

Essentially, Django's TransactionMiddleware will not explicitly COMMIT a transaction unless it is marked dirty (usually by writing data). Yet it still issues a BEGIN for every request, even if the queries are read-only. So Postgres is left waiting to see whether any more commands are coming, and you get idle transactions.

The linked article shows a small modification to the transaction middleware to always commit (basically remove the condition that checks if the transaction is_dirty). I'll be trying this fix in a production environment shortly.
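To make the behavior concrete, here is a minimal, self-contained simulation (no Django required) of the logic described above. The `FakeTransaction` class and `handle_request` function are hypothetical stand-ins for Django's transaction machinery, not its real API; the point is only to show why committing solely when dirty leaves read-only requests "idle in transaction", while the always-commit change closes them.

```python
class FakeTransaction:
    """Hypothetical stand-in for a database connection's transaction state."""
    def __init__(self):
        self.in_transaction = False   # has a BEGIN been issued?
        self.dirty = False            # has any write happened?

    def begin(self):
        self.in_transaction = True

    def set_dirty(self):
        self.dirty = True

    def commit(self):
        self.in_transaction = False
        self.dirty = False


def handle_request(txn, writes, always_commit):
    """Simulate one request under the managed-transaction middleware."""
    txn.begin()                      # a BEGIN is issued for every request
    if writes:
        txn.set_dirty()
    # Stock logic commits only if the transaction is dirty;
    # the patched middleware commits unconditionally.
    if always_commit or txn.dirty:
        txn.commit()
    return txn.in_transaction        # True => left "idle in transaction"


# A read-only request under the stock logic leaves an open transaction:
print(handle_request(FakeTransaction(), writes=False, always_commit=False))  # True
# The same request under the always-commit fix does not:
print(handle_request(FakeTransaction(), writes=False, always_commit=True))   # False
```

A write request is unaffected by the change, since a dirty transaction was already being committed.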

Upvotes: 2

Mark Stosberg

Reputation: 13381

First, I would check out the related post What does it mean when a PostgreSQL process is “idle in transaction”? which covers some related ground.

One cause of "idle in transaction" can be developers or sysadmins who have entered "BEGIN;" in psql and forgotten to COMMIT or ROLLBACK. I've been there. :)

However, you mentioned your problem is related to having a lot of concurrent connections. It sounds like investigating the "locks" tip from the post above may be helpful to you.
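Either way, you can see which backends are stuck and for how long by querying `pg_stat_activity`. On Postgres 9.1 (as in the question) the relevant columns are `procpid` and `current_query`; in 9.2 and later these became `pid`, `state`, and `query`, so adjust accordingly:

```sql
-- Postgres 9.1: idle-in-transaction sessions appear with
-- current_query = '<IDLE> in transaction'.
SELECT procpid,
       usename,
       now() - xact_start  AS txn_age,
       now() - query_start AS idle_for
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY txn_age DESC;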

A couple more suggestions: this problem may be secondary. The primary problem might be that 200 connections is more than your hardware and tuning can comfortably handle, so everything gets slow, and when things get slow, more things are waiting for other things to finish.

If you don't have a reverse proxy like Nginx in front of your web app, consider adding one. It can run on the same host without additional hardware. The reverse proxy will regulate the number of connections to the backend Django web server, and thus the number of database connections. I've been here before with too many database connections, and this is how I solved it!

With Apache's prefork model, there is a 1:1 correspondence between the number of Apache workers and the number of database connections, assuming something like Apache::DBI is in use. Imagine someone connects to the web server over a slow connection. The web and database servers take care of the request relatively quickly, but then the request is held open on the web server unnecessarily long while the content is dribbled back to the client. Meanwhile, the database connection slot is tied up.
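A quick back-of-the-envelope sketch (Little's Law: connections held = arrival rate × hold time) shows why slow clients inflate the connection count. The numbers below are assumptions for illustration, not measurements:

```python
# Hypothetical workload: 50 requests/second, 0.1 s of real web/db work
# per request, plus 3.9 s spent dribbling the response to a slow client.
arrival_rate = 50          # requests per second (assumed)
backend_work = 0.1         # seconds of actual processing (assumed)
slow_client_drain = 3.9    # seconds spent streaming to a slow client (assumed)

# Without a reverse proxy, the Apache worker (and its database
# connection, with something like Apache::DBI) is held for the full
# duration, including the slow drain to the client:
no_proxy = arrival_rate * (backend_work + slow_client_drain)

# With a buffering reverse proxy, the backend hands off the reply
# quickly, freeing its worker and db slot; the proxy absorbs the drain:
with_proxy = arrival_rate * backend_work

print(no_proxy)    # 200.0 backend connections held open
print(with_proxy)  # 5.0 backend connections held open
```

Under these (assumed) numbers, buffering at the proxy cuts the backend connections held from 200 to 5, which matches the intuition in the paragraph above.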

By adding a reverse proxy, the backend server can quickly deliver a reply to the reverse proxy, freeing the backend worker and database slot. The reverse proxy is then responsible for getting the content back to the client, possibly holding open its own connection for longer. You may have 200 connections to the reverse proxy up front, but you'll need far fewer workers and db slots on the backend.

If you graph the db slots with MRTG or similar, you'll see how many slots you are actually using, and can tune down max_connections in PostgreSQL, freeing those resources for other things.

You might also look at pg_top to help monitor what your database is up to.

Upvotes: 5
