phil0s0pher
phil0s0pher

Reputation: 311

How to count the number of unique values on query's related model in Django?

Say I have these three models:

class User(models.Model):
    ....

class Tweet(models.Model):
    created_at = models.DateTimeField(blank=True, null=True)
    user = models.ForeignKey(
        "core.User", related_name="tweets", on_delete=models.SET_NULL, null=True
    )


class Tag(models.Model):
    tag = models.CharField(max_length=100, blank=False, unique=True)
    tweets = models.ManyToManyField(Tweet, related_name="calls")

The query I want to build is 'Tags, ordered by the number of unique tweet users which made tweets in a particular time period'. I had built a custom counting query to achieve this, it worked, but it was slow. I've now arranged the DB as above, with the tags as a separate model related to Tweets.

tags = Tag.objects.filter(tweets__created_at__week=date.isocalendar()[1]).annotate(count=Count('tweets__user')).filter(count__gt=1).order_by('-count', 'asset').distinct()

The issue is, the Count('tweets__user') part of the query effectively counts the number of tweets associated with the tag. The tweets can (and are) often from the same account, I want the number of unique twitter user accounts. Is there a way to build this query, with the data modelled in this way, using Django only?

Upvotes: 0

Views: 101

Answers (1)

Abdul Aziz Barkat
Abdul Aziz Barkat

Reputation: 21787

Count('tweets__user') part of the query effectively counts the number of tweets associated with the tag.

That happens because tweets is an m2m and writing that part makes a join, hence counting all tweets. To solve this you need to specify the distinct keyword argument on the call to Count [Django docs]:

tags = Tag.objects.filter(
    tweets__created_at__week=date.isocalendar()[1]
).annotate(
    count=Count('tweets__user', distinct=True)
).filter(count__gt=1).order_by('-count', 'asset').distinct()

Upvotes: 1

Related Questions