Reputation: 15620
I have a usecase where I have to count occurences of a ManyToManyField but its getting more complex than I'd think.
models.py:
class Tag(models.Model):
name = models.CharField(max_length=100, unique=True)
class People(models.Model):
tag = models.ManyToManyField(Tag, blank=True)
Here I have to come up with a list of Tags
and the number of times they appear overall but only for those People
who have >0 and <6 tags. Something like:
tag1 - 265338
tag2 - 4649303
tag3 - 36636
...
This is how I came up with the count initially:
q = People.objects.annotate(tag_count=Count('tag')).filter(tag_count__lte=6, tag_count__gt=0)
for tag in Tag.objects.all():
cnt = q.filter(tag__name=tag.name).count()
# doing something with the cnt
But I later realised that this may be inefficient since I am probably iterating through the People
model many times (Records in People are way larger than those in Tag).
Intuitively I think I should be able to do one iteration of the Tag
model without any iteration of the People
model. So then I came up with this:
for tag in Tag.objects.all():
cnt = tag.people_set.annotate(tag_count=Count('tag')).filter(tag_count__lte=6).count()
# doing something with the cnt
But, first, this is not producing the expected results. Second, I am thinking this has become more complex that it seemed to be, so perhaps I am complicating a simple thing. All ears to any advice.
Update: I got queryset.query and ran the query on the db to debug it. For some reason, the tag_count column in the resulting join shows all 1's. Can't seem to understand why.
Upvotes: 0
Views: 1022
Reputation: 33
Can be done using reverse ManyToMany field query.
Would also reduce the overhead, and shift most of overhead from python to database server.
from some_app.models import Tag, People
from django.db.models import F, Value, Count, CharField
from django.db.models.functions import Concat
# queryset: people with tags >0 and <6, i.e. 1 to 5 tags
people_qualified = People.objects.annotate(tag_count=Count('tag'))\
.filter(tag_count__range=(1, 5))
# query tags used with above category of people, with count
tag_usage = Tag.objects.filter(people__in=people_qualified)\
.annotate(tag=F('name'), count=Count('people'))\
.values('tag', 'count')
# Result: <QuerySet [{'count': 3, 'tag': u'hello'}, {'count': 2, 'tag': u'world'}]>
# similarily, if needed the string output
tag_usage_list = Tag.objects.filter(people__in=people_qualified)\
.annotate(tags=Concat(F('name'), Value(' - '), Count('people'),
output_field=CharField()))\
.values_list('tags', flat=True)
# Result: <QuerySet [u'hello - 3', u'world - 2']>
Upvotes: 2