Reputation: 311
Say I have these three models:
class User(models.Model):
....
class Tweet(models.Model):
created_at = models.DateTimeField(blank=True, null=True)
user = models.ForeignKey(
"core.User", related_name="tweets", on_delete=models.SET_NULL, null=True
)
class Tag(models.Model):
tag = models.CharField(max_length=100, blank=False, unique=True)
tweets = models.ManyToManyField(Tweet, related_name="calls")
The query I want to build is 'Tags, ordered by the number of unique tweet users which made tweets in a particular time period'. I had built a custom counting query to achieve this, it worked, but it was slow. I've now arranged the DB as above, with the tags as a separate model related to Tweets.
tags = Tag.objects.filter(tweets__created_at__week=date.isocalendar()[1]).annotate(count=Count('tweets__user')).filter(count__gt=1).order_by('-count', 'asset').distinct()
The issue is, the Count('tweets__user')
part of the query effectively counts the number of tweets associated with the tag. The tweets can (and are) often from the same account, I want the number of unique twitter user accounts. Is there a way to build this query, with the data modelled in this way, using Django only?
Upvotes: 0
Views: 101
Reputation: 21787
Count('tweets__user')
part of the query effectively counts the number of tweets associated with the tag.
That happens because tweets
is an m2m and writing that part makes a join, hence counting all tweets. To solve this you need to specify the distinct
keyword argument on the call to Count
[Django docs]:
tags = Tag.objects.filter(
tweets__created_at__week=date.isocalendar()[1]
).annotate(
count=Count('tweets__user', distinct=True)
).filter(count__gt=1).order_by('-count', 'asset').distinct()
Upvotes: 1