Ivan Peng
Ivan Peng

Reputation: 609

Django: Annotate sum on variable number of columns

So I've read the annotate columns, and utilizing the F() functions, and I saw this post on how to sum multiple columns. However, I'm working with an EAV style DB, so I'm given a variable list of columns. Consider the example:

class TestModel(models.Model):
    column_a = models.FloatField()
    column_b = models.FloatField()
    column_c = models.FloatField()
    column_d = models.FloatField()

ATTEMPT 1:

columns = {'column_a', 'column_c', 'column_d'}
queryset = DummyModel.objects.annotate(total=Sum([F(column_name) for column_name in columns]))

However, a print(queryset.query) yields the error django.core.exceptions.FieldError: Cannot resolve expression type, unknown output_field

ATTEMPT 2:

queryset = DummyModel.objects.annotate(total=ExpressionWrapper(Sum([F(column_name) for column_name in columns]), output_field=FloatField())) 

This does not yield a compilation error, but the SQL query yields:

SELECT "test_model"."column_a", "test_model"."column_c", "test_model"."column_d", SUM([]) AS "total" FROM "test_model"

Which is empty. Does anyone have any idea of how to solve this? Any help is greatly appreciated!

Upvotes: 3

Views: 3696

Answers (1)

jacobian
jacobian

Reputation: 4728

To figure this out, it helps to picture the table first:

column_a | column b | column_c
---------+----------+----------
1        | 2        | 3
4        | 5        | 6
7        | 8        | 9

Sum is a "vertical" operation; that is, if we wanted the sum of column_a, we could do

>>> DummyModel.objects.aggregate(total=Sum('column_a'))
{'total': 12}

As you can see, this returns 1 + 4 + 7 == 12 -- so you can see why I call it a "vertical" sum. Notice we use aggregate rather than annotate: aggregate is for vertical operators.

If instead we want "horizontal" sum -- the total across a row -- we'd use F() and +. So to get column_a + column_b in each row, we'd use

>>> DummyModel.objects.annotate(total=F('column_a') + F('column_b')).values('total')
<QuerySet [{'total': 3}, {'total': 9}, {'total': 15}]>

Hopefully you can see why I call this a "horizontal" sum: we're getting a sum of a and b "horizontally" in each row. And now notice we use annotate, which is for horizontal operations.

If the names of the columns aren't known beforehand, you'd need to get tricky and use functools.reduce and operator.add to build up an expression:

>>> from functools import reduce
>>> from operator import add

>>> cols = ['column_b', 'column_c']
>>> expr = reduce(add, (F(col) for col in cols))
>>> DummyModel.objects.annotate(total=expr).values('total')
<QuerySet [{'total': 5}, {'total': 11}, {'total': 17}]>

If we want both a horizontal and a vertical sum -- i.e. the sum of column_a plus the sum of column_b -- we need to use Sum and F():

>>> DummyModel.objects.aggregate(total=Sum(F('column_a') + F('column_b')))
{'total': 27}

Notice: aggregate rather than annotate, since we're ultimately going with a vertical operation; a Sum of rows. Yes, there's a horizontal operation first, but since we ultimately Sum, we need aggregate.

So, to wrap things up, if the fields are a variable, we need to combine aggregate, Sum, and the reduce trickery from above:

>>> cols = ['column_b', 'column_c']
>>> expr = reduce(add, (F(col) for col in cols))
>>> DummyModel.objects.aggregate(total=Sum(expr))
{'total': 33}

Hope this helps!

Upvotes: 6

Related Questions