Django filter() Increases Result QuerySet Rather than Reducing

Question

In testing my Django application I ran across a performance issue with a particular query when a particular filter() was used. The issue was solved, but I don't understand why.

I'm filtering a QuerySet based on 3 criteria. When including them all at once, I get the expected number of results. When moving one of the clauses to its own filter() call, the resulting QuerySet is much, much larger than it should be and is filled with seemingly duplicate entries.

See below for an illustration of this. Moving the result__is_false=False clause to its own filter() causes a huge expansion in the resulting QuerySet.

>>> Finding.objects.filter(project=1, result__scan_session__is_enabled=True,\ 
    result__is_false=False).count()
3566

>>> Finding.objects.filter(project=1, result__scan_session__is_enabled=True)\
    .filter(result__is_false=False).count()
10050380

The problem was that I needed to conditionally apply the filter in question, not every time. I solved the problem by converting the filter() routine to an exclude() routine like this:

>>> Finding.objects.filter(project=1, result__scan_session__is_enabled=True)\
    .exclude(result__is_false=True).count()
3566

That seemed to have done the trick!

While I'm happy that this worked, I don't understand two things:

1) Why posts explaining filter() and exclude() seem to suggest that they can be treated as equal but opposite, when based on the example above, that is not always the case.

2) Why moving the result__is_false=False clause to its own filter() caused the huge jump in results in the first place.

Does anyone have any insight into either of these?

Django filter() Increases Result QuerySet Rather than Reducing

Answers (1)

Related Questions