Reputation: 15824
In a Django app, I have a queryset for a data model called Comment
. This contains text comments left by users.
Imagine 'n' users commented. What's the fastest way to calculate what % of comments were left by which user?
Right now I'm thinking it's going to be:
Comment.objects.filter(commenter=<username>).count()/Comment.objects.count()
What do you suggest? My objective is to flag people who're commenting too much, in order to screen their accounts for possible spamming. I'd be running this query voluminously, hence the focus on performance.
Upvotes: 0
Views: 1326
Reputation: 31404
You should avoid making one query for each user in your database. Instead you can just query the number of comments for each user (or even the top n
commenters) with something like:
from django.db.models import Count
total_comments = Comment.objects.count()
# Fetch top 10 commenters, annotated with number of comments, ordered by highest first
User.objects.annotate(num_comments=Count('comment')).order_by('-num_comments')[:10]
for user in users:
percentage = user.num_comments / total_comments
This example assumes you have a User
model that your Comment
has a foreign key to.
The percentage of total comments doesn't actually matter if you are comparing relative numbers of comments.
Upvotes: 1