jc315
jc315

Reputation: 107

Annotating Django querysets with ForeignKey Counts subject to conditions

Here is a simplified version of my models:

class Airport(models.Model):
    iata = models.CharField()
    name = models.CharField()
    latitude = models.FloatField()
    longitude = models.FloatField()

class Flight(models.Model):
    origin = models.ForeignKey('Airport', related_name='origins')
    destination = models.ForeignKey('Airport', related_name='destinations')
    owner = models.ForeignKey(User)

Given a User, I would like to create a list of all the Airport objects that appear in either the origin or destination fields of the Flight objects he owns, each annotated with the corresponding number of Flight objects.

For example, suppose a user has been on 3 flights: LAX-LHR, LHR-CDG, and CDG-JFK. Then I would like a query which returns the following object:

[LHR, id__count=2}, {CDG, id__count=2}, {LAX, id__count=1}, {JFK, id__count=1}]

In the above, the three letter codes stand for Airport objects or all their fields.

Generally, there may be thousands of Users and tens of thousands of Airports and Flights, so I am seeking something more efficient than the obvious solution with for loops and if statements, preferably in a single database query.

My current progress is this query:

Airport.objects.filter(
    Q(origins__owner=user) | Q(destinations__owner=user)
)
.distinct()
.annotate(
    id__count=Count('origins', distinct=True) + Count('destinations', distinct=True)
).order_by('-id__count')

This works perfectly with only one user, because the initial filter only keeps those airports which appear somewhere in his flights. But it clearly fails when their are multiple users, because the counts include every user's flights. I need some way to only Count those Flight objects which obey a certain property, namely owner=user where user is a certain User object.


Edit: after reading this page in the Djnago documentation, it seems that putting the filter first should make this work as needed. But it doesn't, at least when I use Q objects. I've found the following highly confusing result.

When I use this query, i.e. only looking at the origins, then it works, and the num_origins field counts only those flights belonging to the specified user:

Airport.objects.filter(origins__owner=user).annotate(num_origins=Count('origins'))

(This isn't exactly what I need because the counts only include flights whose origin is a certain Airport, but it does filter the Users correctly.)

But, when I do nothing but replace a single filter with two Q objects combined with or, i.e.

Airport.objects.filter(Q(origins__owner=user) | Q(destinations__owner=user)).annotate(num_origins=Count('origins'))

now it counts flights belonging to every user! It seems that the annotate "forgets" about the filter when it uses Q objects. What is going on here?

Upvotes: 3

Views: 3516

Answers (2)

solarissmoke
solarissmoke

Reputation: 31434

I think you can achieve this with conditional expressions:

from django.db.models import Case, When

Airport.objects.filter(
    Q(origins__owner=user) | Q(destinations__owner=user)
).annotate(
    num_origins=Count(
        Case(When(Q(origin__owner=user), then=1),output_field=CharField()),
    ),
    num_destinations=Count(
        Case(When(Q(destination__owner=user), then=1),output_field=CharField()),
    )
)

Note that the When clause is repeating the same filter that you do initially. It might actually be more efficient to do this instead (you probably need to inspect the resulting SQL query to find out):

Airport.objects.annotate(
    num_origins=Count(
        Case(When(Q(origin__owner=user), then=1), output_field=CharField()),
    ),
    num_destinations=Count(
        Case(When(Q(destination__owner=user), then=1),output_field=CharField()),
    )
).filter(Q(num_origins__gt=0) | Q(num_destinations__gt=0))

i.e., annotate all flights, and then filter out the ones where the count was 0.

You can then add up num_origins and num_destinations in Python.

If you are using Django 2, then it is simpler still because you can pass a filter argument to Count:

Airport.objects.annotate(
    num_origins=Count('origins', filter=Q(origin__owner=user), distinct=True),
    num_destinations=Count('destinations', filter=Q(destination__owner=user), disctinct=True)
).filter(Q(num_origins__gt=0) | Q(num_destinations__gt=0))

Upvotes: 5

Cagatay Barin
Cagatay Barin

Reputation: 3496

Can you try like this? I didn't test it on shell, so I'm not sure about 'distinct_flights' list structure but you'll get the idea.

# This is all of the distinct flights of your users.
distinct_flights = Flight.objects.filter(owner__in=[user1.id, user2.id]).distinct().values_list('origin','destination')

# This is all of the airports included in the flights above. 
Airport.objects.filter(
    Q(origins__in=distinct_flights['origin'])||
    Q(destination__in=distinct_flights['destination'])
)

# The rest is annotation from those airports as you did before. You can annotate it on the above query again.

Upvotes: 0

Related Questions