Hassan Baig
Hassan Baig

Reputation: 15824

Calculating frequency of occurrence as a percentage of total, in Django queryset

In a Django app, I have a queryset for a data model called Comment. This contains text comments left by users.

Imagine 'n' users commented. What's the fastest way to calculate what % of comments were left by which user?

Right now I'm thinking it's going to be:

Comment.objects.filter(commenter=<username>).count()/Comment.objects.count()

What do you suggest? My objective is to flag people who're commenting too much, in order to screen their accounts for possible spamming. I'd be running this query voluminously, hence the focus on performance.

Upvotes: 0

Views: 1326

Answers (1)

solarissmoke
solarissmoke

Reputation: 31404

You should avoid making one query for each user in your database. Instead you can just query the number of comments for each user (or even the top n commenters) with something like:

from django.db.models import Count

total_comments = Comment.objects.count()
# Fetch top 10 commenters, annotated with number of comments, ordered by highest first
User.objects.annotate(num_comments=Count('comment')).order_by('-num_comments')[:10]
for user in users:
    percentage = user.num_comments / total_comments

This example assumes you have a User model that your Comment has a foreign key to.

The percentage of total comments doesn't actually matter if you are comparing relative numbers of comments.

Upvotes: 1

Related Questions