Reputation: 820
Is it possible to calculate the cumulative (running) sum using django's orm? Consider the following model:
class AModel(models.Model):
a_number = models.IntegerField()
with a set of data where a_number = 1
. Such that I have a number ( >1 ) of AModel
instances in the database all with a_number=1
. I'd like to be able to return the following:
AModel.objects.annotate(cumsum=??).values('id', 'cumsum').order_by('id')
>>> ({id: 1, cumsum: 1}, {id: 2, cumsum: 2}, ... {id: N, cumsum: N})
Ideally I'd like to be able to limit/filter the cumulative sum. So in the above case I'd like to limit the result to cumsum <= 2
I believe that in postgresql one can achieve a cumulative sum using window functions. How is this translated to the ORM?
Upvotes: 12
Views: 5395
Reputation: 2307
For reference, starting with Django 2.0 it is possible to use the Window
function to achieve this result:
AModel.objects.annotate(cumsum=Window(Sum('a_number'), order_by=F('id').asc()))\
.values('id', 'cumsum').order_by('id', 'cumsum')
Upvotes: 18
Reputation: 3847
For posterity, I found this to be a good solution for me. I didn't need the result to be a QuerySet, so I could afford to do this, since I was just going to plot the data using D3.js:
import numpy as np
import datettime
today = datetime.datetime.date()
raw_data = MyModel.objects.filter('date'=today).values_list('a_number', flat=True)
cumsum = np.cumsum(raw_data)
Upvotes: 2
Reputation: 820
From Dima Kudosh's answer and based on https://stackoverflow.com/a/5700744/2240489 I had to do the following:
I removed the reference to PARTITION BY
in the sql and replaced with ORDER BY
resulting in.
AModel.objects.annotate(
cumsum=Func(
Sum('a_number'),
template='%(expressions)s OVER (ORDER BY %(order_by)s)',
order_by="id"
)
).values('id', 'cumsum').order_by('id', 'cumsum')
This gives the following sql:
SELECT "amodel"."id",
SUM("amodel"."a_number")
OVER (ORDER BY id) AS "cumsum"
FROM "amodel"
GROUP BY "amodel"."id"
ORDER BY "amodel"."id" ASC, "cumsum" ASC
Dima Kudosh's answer was not summing the results but the above does.
Upvotes: 5
Reputation: 7376
You can try to do this with Func expression.
from django.db.models import Func, Sum
AModel.objects.annotate(cumsum=Func(Sum('a_number'), template='%(expressions)s OVER (PARTITION BY %(partition_by)s)', partition_by='id')).values('id', 'cumsum').order_by('id')
Upvotes: 1
Reputation: 20339
Check this
AModel.objects.order_by("id").extra(select={"cumsum":'SELECT SUM(m.a_number) FROM table_name m WHERE m.id <= table_name.id'}).values('id', 'cumsum')
where table_name
should be the name of table in database.
Upvotes: 0