espenhogbakk
espenhogbakk

Reputation: 11728

Django objects.filter, how "expensive" would this be?

I am trying to make a search view in Django. It is a search form with freetext input + some options to select, so that you can filter on years and so on. This is some of the code I have in the view so far, the part that does the filtering. And I would like some input on how expensive this would be on the database server.

soknad_list = Soknad.objects.all()

if var1:
    soknad_list = soknad_list.filter(pub_date__year=var1)

if var2:
    soknad_list = soknad_list.filter(muncipality__name__exact=var2)

if var3:
    soknad_list = soknad_list.filter(genre__name__exact=var3)

# TEXT SEARCH
stop_word_list = re.compile(STOP_WORDS, re.IGNORECASE)
search_term = '%s' % request.GET['q']
cleaned_search_term = stop_word_list.sub('', search_term)
cleaned_search_term = cleaned_search_term.strip()
if len(cleaned_search_term) != 0:
    soknad_list = soknad_list.filter(Q(dream__icontains=cleaned_search_term) | Q(tags__icontains=cleaned_search_term) | Q(name__icontains=cleaned_search_term) | Q(school__name__icontains=cleaned_search_term))

So what I do is, first make a list of all objects, then I check which variables exists (I fetch these with GET on an earlier point) and then I filter the results if they exists. But this doesn't seem too elegant, it probably does a lot of queries to achieve the result, so is there a better way to this?

It does exactly what I want, but I guess there is a better/smarter way to do this. Any ideas?

Upvotes: 2

Views: 2455

Answers (3)

anon
anon

Reputation:

As Aaron mentioned, you should get a hold of the query text that is going to be run against the database and use an EXPLAIN (or other some method) to view the query execution plan. Once you have a hold of the execution plan for the query you can see what is going on in the database itself. There are a lot of operations that see very expensive to run through procedural code that are very trivial for any database to run, especially if you provide indexes that the database can use for speeding up your query.

If I read your question correctly, you're retrieving a result set of all rows in the Soknad table. Once you have these results back you use the filter() method to trim down your results meet your criteria. From looking at the Django documentation, it looks like this will do an in-memory filter rather than re-query the database (of course, this really depends on which data access layer you're using and not on Django itself).

The most optimal solution would be to use a full-text search engine (Lucene, ferret, etc) to handle this for you. If that is not available or practical the next best option would be to to construct a query predicate (WHERE clause) before issuing your query to the database and let the database perform the filtering.

However, as with all things that involve the database, the real answer is 'it depends.' The best suggestion is to try out several different approaches using data that is close to production and benchmark them over at least 3 iterations before settling on a final solution to the problem. It may be just as fast, or even faster, to filter in memory rather than filter in the database.

Upvotes: -1

Aaron
Aaron

Reputation: 838

You can see the query that will be generated by using:

soknad_list.query.as_sql()[0]

You can then put that into your database shell to see how long the query takes, or use EXPLAIN (if your database backend supports it) to see how expensive it is.

Upvotes: 2

hasen
hasen

Reputation: 166282

filter itself doesn't execute a query, no query is executed until you explicitly fetch items from query (e.g. get), and list( query ) also executes it.

Upvotes: 4

Related Questions