Django annotate returning duplicate entries

Question

I'm annotating a queryset like so:

class ItemQuerySet(models.QuerySet):
    def annotate_subitem_stats(self):
        return self.annotate(
            count_subitems=Count('subitems'),
            has_sent_subitems=Case(
                When(subitems__status=Status.sent, then=Value(True)),
                default=Value(False)
            ),
        )

In this example, SubItem is a model with a foreign key to Item.

A strange behaviour happens when I run this code. Suppose we have 1 Item and 2 SubItems linked to it. One subitem has status sent and the other doesn't. When I run annotate on the queryset, the queryset returns the item twice, one with has_sent_subitems set to True and the other set to False. The other strange thing is that, one duplicate has count_subitems == 1, and the other has count_subitems == 1, as if the queryset has split item into two rows, one where status == 'sent' and the other where status != 'sent'.

This is basically what the annotated queryset looks like:

[
    {
        'name': 'Item Name',
        'count_subitems': 1,
        'has_sent_subitem': False
    },
    {
        'name': 'Item Name',
        'count_subitems': 1,
        'has_sent_subitem': True
    }
]

This is what the database looks like, using pseudocode:

item = Item()
SubItem(item=item, status=draft)
SubItem(item=item, status=sent)

I'm pretty sure this has to do with line When(subitems__status=Status.sent, then=Value(True)),. Is there any way I can make that line check if only 1 item has status sent, and then set the annotation to true and move on?

P.S. Using .distinct() didn't work. I can't use .distinct(field), because annotate() + distinct(fields) is not implemented.

Brian Destura · Accepted Answer

You can instead use Exists subquery for this, to avoid the join caused by subitems__status, so:

from django.db.models import Exists, OuterRef


class ItemQuerySet(models.QuerySet):
    def annotate_subitem_stats(self):
        return self.annotate(
            count_subitems=Count('subitems'),
            has_sent_subitems=Exists(
                SubItem.objects.filter(item=OuterRef('pk'), status=Status.sent)
            ),
        )

Django annotate returning duplicate entries

Answers (2)

Combining multiple aggregations

Related Questions