Time-window rolling sum in Postgresql

Question

I'd like to know if it's possible to use window queries based on time in Postgresql.

The original data is in the first three columns (date, salesman, amount):

date	salesman	amount	sum-3-rolling-days
2020-01-01	john	10	10
2020-01-02	john	15	25
2020-01-03	john	8	33
2020-01-04	john	12	35
2020-01-05	john	11	31
2020-01-01	daniel	5	5
2020-01-02	daniel	6	11
2020-01-03	daniel	7	18
2020-01-04	daniel	8	21
2020-01-05	daniel	9	24

The fourth column represents the total amount by this salesman during the past three rolling days.

Pandas has built-in functions to do this, but I can't think of any way to do it in Postgresql using built-in sum() over () syntax. The only way I was able to is using a convoluted mix of lateral joins and subqueries with conditions on time delta comparisons, which is inelegant to say the least.

Pandas' way (by memory, exact syntax might differ slightly) — cannot get any terser:

df.groupby('salesman').rolling('3d').sum()

S-Man · Accepted Answer

demos:db<>fiddle

SELECT
    *,
    SUM(amount) OVER (
        PARTITION BY salesman                     -- 1
        ORDER BY "date"                           -- 2
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW  -- 3
    )
FROM mytable

Build groups/partitions by the salesman. So, the following will only done within theses partitions
Order them by the date column
Sum only those records which are between 2 date fore the current one and the current one. This is the rolling part

If you are using Postgres 11 or later, you can define the window even more precisely using RANGE with date intervals instead of counting ROWS:

SELECT
    *,
    SUM(amount) OVER (
        PARTITION BY salesman                    
        ORDER BY "date"                          
        RANGE BETWEEN interval '2 days' PRECEDING AND CURRENT ROW
    )
FROM mytable

Time-window rolling sum in Postgresql

Answers (2)

Related Questions