Happy Ahmad
Happy Ahmad

Reputation: 1087

Wrong results when using two annotate expressions concurrently

Assume the following tables:

class Table1(models.Model): 
    Column1 = models.IntegerField()
class Table2(models.Model):  
    Column2 = models.IntegerField()
class Table3(models.Model):  
    Table1 = models.ForeignKey(Table1, null=False, on_delete=models.CASCADE)
    Table2 = models.ForeignKey(Table2, null=False, on_delete=models.CASCADE)
    Column3 = models.IntegerField()
class Table4(models.Model):  
    Table1 = models.ForeignKey(Table1, null=False, on_delete=models.CASCADE)
    Table3 = models.ForeignKey(Table3, null=False, on_delete=models.CASCADE)
    Column4 = models.IntegerField()

This annotate expression returns right answer:

print(Table1.objects.annotate(Exp1=Sum(
            Case(
                When(table3__Table2__Column2__in=[2, 3],
                     then=F('table3__Column3')),
                default=Value(0)
            ),
    )).values('Exp1'))

That is:

<QuerySet [{'Exp1': 96}]>

And I need to define another annotate expression as below:

print(Table1.objects.annotate(Exp2=Sum(
            Case(
                When(table4__Table3__Table2__Column2=3,
                     then=F('table4__Column4')),
                default=Value(0)
            ),
      )).values('Exp2'))

Again the result is correct:

<QuerySet [{'Exp2': 0}]>

Finally, I want to combine these two in one command:

print(Table1.objects.annotate(Exp1=Sum(
            Case(
                When(table3__Table2__Column2__in=[2, 3],
                     then=F('table3__Column3')),
                default=Value(0)
            ),
    ), Exp2=Sum(
            Case(
                When(table4__Table3__Table2__Column2=3,
                     then=F('table4__Column4')),
                default=Value(0)
            ),
    )).values('Exp1', 'Exp2'))

But unfortunately the result is not correct:

<QuerySet [{'Exp2': 0, 'Exp1': 480}]>

Upvotes: 3

Views: 638

Answers (2)

Happy Ahmad
Happy Ahmad

Reputation: 1087

The bug is reported here but it's not solved yet even in Django 1.11. The issue is related to joining two tables in reverse relations. Notice that distinct parameter works well for Count but not for Sum. So I used a trick and wrote the below ORM that works pretty well:

print(Table1.objects.annotate(
    temp_Exp1=Sum(
            Case(
                When(table3__Table2__Column2__in=[2, 3],
                     then=F('table3__Column3')),
                default=Value(0)
            ),), 
    temp_Exp2=Sum(
            Case(
                When(table4__Table3__Table2__Column2=3,
                     then=F('table4__Column4')),
                default=Value(0)
            ),),
    table3_count=Count('table3'),
    table3_count_distinct=Count('table3', distinct=True),
    table4_count=Count('table4'),
    table4_count_distinct=Count('table4', distinct=True),
).annotate(
    Exp1=F('temp_Exp1')*F('table3_count_distinct')/F('table3_count'),
    Exp2=F('temp_Exp2')*F('table4_count_distinct')/F('table4_count'),
).values('Exp1', 'Exp2'))

Upvotes: 0

Moses Koledoye
Moses Koledoye

Reputation: 78556

There's a caveat when using multiple aggregations with annotate:

Combining multiple aggregations with annotate() will yield the wrong results because joins are used instead of subqueries. For most aggregates, there is no way to avoid this problem...

And unlike aggregation with Count, Sum does not take a distinct parameter that could help dedupe items. I guess you should keep the distinct queries if you are to have correct results.

See Combining multiple aggregations.

Upvotes: 2

Related Questions