whitebear
whitebear

Reputation: 12461

Duplicate records are fetched in filtered queriset

I have a little bit complex combination of | & filter like this.

objs = objs.annotate(num_line=Count("lns")).\
    filter(Q(lns__id__in=lnIds) | (Q(sts__id__in=oIds) 
        & (Q(lns__id__in=lnIds) | Q(num_ln__lte=0))))

It looks like work, but results are sometimes deplicated (has same id).

   id
    1
    2
    3
    3
    4
    5

I thought filter is assured to return unique id though ,am I wrong??

Or is it possible to omit duplicated rows??

class Obj(models.Model):
    created_at = models.DateTimeField(null=True)
    lns = models.ManyToManyField(Ln)
    sts = models.ManyToManyField(St)
    is = models.ManyToManyField(Is)
    pub_date = models.DateTimeField('date published')

Upvotes: 1

Views: 28

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 477704

No, if you filter in one-to-many or many-to-many relations, it will create a JOIN with the related table, and if multiple objects match, then this will result in yielding the same value multiple times.

You can make use of .distinct(..) (Django-doc) to filter the duplicate rows.

objs = objs.filter(
    Q(lns__id__in=lnIds) |
    (Q(sts__id__in=oIds) & (Q(lns__id__in=lnIds) | Q(num_ln__lte=0))
).distinct()

What is even more problematic is that if you add an annotation, then this can result in the fact that the count will count duplicates as well. You can add distinct=True [Django-doc] to the Count(..) expression to prevent that:

objs = objs.annotate(
    num_line=Count('lns', distinct=True)
).filter(
    Q(lns__id__in=lnIds) |
    (Q(sts__id__in=oIds) & (Q(lns__id__in=lnIds) | Q(num_ln__lte=0))
).distinct()

Upvotes: 1

Related Questions