DantheMan
DantheMan

Reputation: 7417

django most efficient way to count same field values in a query

Lets say if I have a model that has lots of fields, but I only care about a charfield. Lets say that charfield can be anything so I don't know the possible values, but I know that the values frequently overlap. So I could have 20 objects with "abc" and 10 objects with "xyz" or I could have 50 objects with "def" and 80 with "stu" and i have 40000 with no overlap which I really don't care about.

How do I count the objects efficiently? What I would like returned is something like:

{'abc': 20, 'xyz':10, 'other': 10,000}

or something like that, w/o making a ton of SQL calls.

EDIT:

I dont know if anyone will see this since I am editing it kind of late, but...

I have this model:

class Action(models.Model):
    author = models.CharField(max_length=255)
    purl = models.CharField(max_length=255, null=True)

and from the answers, I have done this:

groups = Action.objects.filter(author='James').values('purl').annotate(count=Count('purl'))

but...

this is what groups is:

{"purl": "waka"},{"purl": "waka"},{"purl": "waka"},{"purl": "waka"},{"purl": "mora"},{"purl": "mora"},{"purl": "mora"},{"purl": "mora"},{"purl": "mora"},{"purl": "lora"}

(I just filled purl with dummy values)

what I want is

{'waka': 4, 'mora': 5, 'lora': 1}

Hopefully someone will see this edit...

EDIT 2:

Apparently my database (BigTable) does not support the aggregate functions of Django and this is why I have been having all the problems.

Upvotes: 66

Views: 52459

Answers (4)

alstr
alstr

Reputation: 1556

Unless your field value is always guaranteed to be in a specific case, it may be useful to transform it prior to performing a count, i.e. so 'apple' and 'Apple' would be treated as the same.

from django.db.models import Count
from django.db.models.functions import Lower

MyModel.objects.annotate(lower_title=Lower('title')).values('lower_title').annotate(num=Count('lower_title')).order_by('num')

Upvotes: 5

beerbajay
beerbajay

Reputation: 20270

You want something similar to "count ... group by". You can do this with the aggregation features of django's ORM:

from django.db.models import Count

fieldname = 'myCharField'
MyModel.objects.values(fieldname)
    .order_by(fieldname)
    .annotate(the_count=Count(fieldname))

Previous questions on this subject:

Upvotes: 113

Daniel Roseman
Daniel Roseman

Reputation: 599480

This is called aggregation, and Django supports it directly.

You can get your exact output by filtering the values you want to count, getting the list of values, and counting them, all in one set of database calls:

from django.db.models import Count
MyModel.objects.filter(myfield__in=('abc', 'xyz')).\
        values('myfield').annotate(Count('myfield'))

Upvotes: 23

Manoj Govindan
Manoj Govindan

Reputation: 74675

You can use Django's Count aggregation on a queryset to accomplish this. Something like this:

from django.db.models import Count
queryset = MyModel.objects.all().annotate(count = Count('my_charfield'))
for each in queryset:
    print "%s: %s" % (each.my_charfield, each.count)

Upvotes: 7

Related Questions