EMP
EMP

Reputation: 61971

Django hits the database for each filter() call

I have some Django 1.3 code that looks up many model instances in a loop, ie.

my_set = myinstance.subitem_set.all()

for value in values:
  existing = my_set.filter(attr_name=value)
  if len(existing) == 1:
     ...

This works, but profiling SQL queries shows that it hits the DB on each iteration. According to https://docs.djangoproject.com/en/1.3/ref/models/querysets/ iterating over the related items should eagerly load them, so I tried calling:

list(my_set)

However, this doesn't help. It does do a query to load all the sub-items, but then it still does an individual query for each sub-item inside the loop. How do I get it to use the cached set and not hit the DB each time? The DB is PostgreSQL 8.4.

Upvotes: 2

Views: 1566

Answers (1)

Andrey Fedoseev
Andrey Fedoseev

Reputation: 5382

The problem is in this line:

if len(existing) == 1:

From Django documentation:

len(). A QuerySet is evaluated when you call len() on it. This, as you might expect, returns the length of the result list.

Note: Don't use len() on QuerySets if all you want to do is determine the number of records in the set. It's much more efficient to handle a count at the database level, using SQL's SELECT COUNT(*), and Django provides a count() method for precisely this reason. See count() below.

So in your case it executes the query each time when you call len(existing). The more effective way is:

existing.count() == 1

This will also hit the database each time you call it but it will execute SELECT COUNT(*) which is faster.

Upvotes: 5

Related Questions