wrdeman
wrdeman

Reputation: 820

Cumulative (running) sum with django orm and postgresql

Is it possible to calculate the cumulative (running) sum using django's orm? Consider the following model:

class AModel(models.Model):
    a_number = models.IntegerField()

with a set of data where a_number = 1. Such that I have a number ( >1 ) of AModel instances in the database all with a_number=1. I'd like to be able to return the following:

AModel.objects.annotate(cumsum=??).values('id', 'cumsum').order_by('id')
>>> ({id: 1, cumsum: 1}, {id: 2, cumsum: 2}, ... {id: N, cumsum: N})

Ideally I'd like to be able to limit/filter the cumulative sum. So in the above case I'd like to limit the result to cumsum <= 2

I believe that in postgresql one can achieve a cumulative sum using window functions. How is this translated to the ORM?

Upvotes: 12

Views: 5395

Answers (5)

Campi
Campi

Reputation: 2307

For reference, starting with Django 2.0 it is possible to use the Window function to achieve this result:

AModel.objects.annotate(cumsum=Window(Sum('a_number'), order_by=F('id').asc()))\
              .values('id', 'cumsum').order_by('id', 'cumsum')

Upvotes: 18

Scott Skiles
Scott Skiles

Reputation: 3847

For posterity, I found this to be a good solution for me. I didn't need the result to be a QuerySet, so I could afford to do this, since I was just going to plot the data using D3.js:

import numpy as np
import datettime

today = datetime.datetime.date()

raw_data = MyModel.objects.filter('date'=today).values_list('a_number', flat=True)

cumsum = np.cumsum(raw_data)

Upvotes: 2

wrdeman
wrdeman

Reputation: 820

From Dima Kudosh's answer and based on https://stackoverflow.com/a/5700744/2240489 I had to do the following: I removed the reference to PARTITION BY in the sql and replaced with ORDER BY resulting in.

AModel.objects.annotate(
    cumsum=Func(
        Sum('a_number'), 
        template='%(expressions)s OVER (ORDER BY %(order_by)s)', 
        order_by="id"
    ) 
).values('id', 'cumsum').order_by('id', 'cumsum')

This gives the following sql:

SELECT "amodel"."id",
SUM("amodel"."a_number") 
OVER (ORDER BY id) AS "cumsum" 
FROM "amodel" 
GROUP BY "amodel"."id" 
ORDER BY "amodel"."id" ASC, "cumsum" ASC

Dima Kudosh's answer was not summing the results but the above does.

Upvotes: 5

Dima  Kudosh
Dima Kudosh

Reputation: 7376

You can try to do this with Func expression.

from django.db.models import Func, Sum

AModel.objects.annotate(cumsum=Func(Sum('a_number'), template='%(expressions)s OVER (PARTITION BY %(partition_by)s)', partition_by='id')).values('id', 'cumsum').order_by('id')

Upvotes: 1

itzMEonTV
itzMEonTV

Reputation: 20339

Check this

AModel.objects.order_by("id").extra(select={"cumsum":'SELECT SUM(m.a_number) FROM table_name m WHERE m.id <= table_name.id'}).values('id', 'cumsum')

where table_name should be the name of table in database.

Upvotes: 0

Related Questions