Reputation: 2775
Given a model that has more than one kind of connection to a related model (I will call the "parent" model), how could I annotate a queryset with a count of parent model objects that are linked through either connection without counting duplicates?
Consider an Article
model that has 2 links to a parent Publication
model that are very similar in meaning.
from django.db import models
class Publication(models.Model):
pass
class Article(models.Model):
publication = models.ForeignKey(Publication, related_name='publications')
owner = models.ForeignKey(Publication, related_name='owned_articles')
I want to serve a page that is a list of publications. A business requirement is that the number of articles that the publication wishes to take credit for are shown (these publications prefer a generous metric for counting). An article is considered part of the organization if either the "owner" or "publication" field points to it, but no articles should be counted more than once for a single publication. An article may be included in the count of 2 publications if publication
points to a different object than owner
.
I don't want to execute a query for every publication in the list.
Publication.objects.annotate(Count('publications'), Count('owned_articles'))
would be trivial. Then I will have count__publications
and count__owned_articles
.
My problem is that I can't tell how many articles in count__publications
were also counted in count__owned_articles
. Django doesn't allow me to cram a full queryset into Count
, so in this general case of needing extra control of what is counted a special mechanism is needed.
I have found this situation most similar to the question here:
Django annotate count with a distinct field
You could contrive this same general situation by intensifying that question's request by adding another related model in addition to InformationUnit
and asking for a count of unique usernames among both related models.
Upvotes: 2
Views: 967
Reputation: 8166
I believe the correct answer is using Count("publications", distinct=True)
, as described here:
https://docs.djangoproject.com/en/3.2/topics/db/aggregation/#combining-multiple-aggregations
Upvotes: 1
Reputation: 2775
(initial answer, answering my own question with a so-so solution)
The preferable approach would be to start with a Publication
queryset, however, I can manage to squeeze a solution out of the Django ORM by pivoting around the Article
queryset instead.
Consider as a solution to this problem:
exclusive_owners_qs = Article.objects.exclude(
publication=F('owner')
).annotate(Count('publication')).order_by('publication')
publications_qs = Article.objects.annotate(Count('owner')).order_by('owner')
With this, I can loop over the two querysets and add up the 2 numbers locally inside of python to get the correct counts.
This satisfies the requirements, but it's also not an elegant solution. Eliminating the need for a python loop would be ideal.
Upvotes: 1