Anupam
Anupam

Reputation: 15620

Reverse Count of ManytoManyField with a condition

I have a usecase where I have to count occurences of a ManyToManyField but its getting more complex than I'd think.

models.py:

class Tag(models.Model):
    name = models.CharField(max_length=100, unique=True)

class People(models.Model):
    tag = models.ManyToManyField(Tag, blank=True)

Here I have to come up with a list of Tags and the number of times they appear overall but only for those People who have >0 and <6 tags. Something like:

tag1 - 265338
tag2 - 4649303
tag3 - 36636
...

This is how I came up with the count initially:

q = People.objects.annotate(tag_count=Count('tag')).filter(tag_count__lte=6, tag_count__gt=0)    

for tag in Tag.objects.all(): 
    cnt = q.filter(tag__name=tag.name).count()
    # doing something with the cnt

But I later realised that this may be inefficient since I am probably iterating through the People model many times (Records in People are way larger than those in Tag).

Intuitively I think I should be able to do one iteration of the Tag model without any iteration of the People model. So then I came up with this:

for tag in Tag.objects.all(): 
    cnt = tag.people_set.annotate(tag_count=Count('tag')).filter(tag_count__lte=6).count()
    # doing something with the cnt

But, first, this is not producing the expected results. Second, I am thinking this has become more complex that it seemed to be, so perhaps I am complicating a simple thing. All ears to any advice.

Update: I got queryset.query and ran the query on the db to debug it. For some reason, the tag_count column in the resulting join shows all 1's. Can't seem to understand why.

Upvotes: 0

Views: 1022

Answers (1)

kaushal
kaushal

Reputation: 33

Can be done using reverse ManyToMany field query.

Would also reduce the overhead, and shift most of overhead from python to database server.

from some_app.models import Tag, People
from django.db.models import F, Value, Count, CharField
from django.db.models.functions import Concat

# queryset: people with tags >0 and <6, i.e. 1 to 5 tags
people_qualified = People.objects.annotate(tag_count=Count('tag'))\
               .filter(tag_count__range=(1, 5))

# query tags used with above category of people, with count
tag_usage = Tag.objects.filter(people__in=people_qualified)\
            .annotate(tag=F('name'), count=Count('people'))\
            .values('tag', 'count')
# Result: <QuerySet [{'count': 3, 'tag': u'hello'}, {'count': 2, 'tag': u'world'}]>

# similarily, if needed the string output
tag_usage_list = Tag.objects.filter(people__in=people_qualified)\
                .annotate(tags=Concat(F('name'), Value(' - '), Count('people'), 
                                      output_field=CharField()))\
                .values_list('tags', flat=True)
# Result: <QuerySet [u'hello - 3', u'world - 2']>

Upvotes: 2

Related Questions