Reputation: 23098
I'd like to know if it's possible to use window queries based on time in Postgresql.
The original data is in the first three columns (date, salesman, amount):
date | salesman | amount | sum-3-rolling-days |
---|---|---|---|
2020-01-01 | john | 10 | 10 |
2020-01-02 | john | 15 | 25 |
2020-01-03 | john | 8 | 33 |
2020-01-04 | john | 12 | 35 |
2020-01-05 | john | 11 | 31 |
2020-01-01 | daniel | 5 | 5 |
2020-01-02 | daniel | 6 | 11 |
2020-01-03 | daniel | 7 | 18 |
2020-01-04 | daniel | 8 | 21 |
2020-01-05 | daniel | 9 | 24 |
The fourth column represents the total amount by this salesman during the past three rolling days.
Pandas has built-in functions to do this, but I can't think of any way to do it in Postgresql using built-in sum() over ()
syntax. The only way I was able to is using a convoluted mix of lateral joins and subqueries with conditions on time delta comparisons, which is inelegant to say the least.
Pandas' way (by memory, exact syntax might differ slightly) — cannot get any terser:
df.groupby('salesman').rolling('3d').sum()
Upvotes: 2
Views: 3926
Reputation: 7862
You can use CTE (common table expression). Try with
with data as (
select
date,
salesman,
amount
from
foo
group by
date,
salesman
order by
date,
salesman,
)
select
*,
sum(amount) over (
partition by salesman order by date,
salesman rows between unbounded preceding and current row
) as rolling
from data;
Upvotes: 0
Reputation: 23766
SELECT
*,
SUM(amount) OVER (
PARTITION BY salesman -- 1
ORDER BY "date" -- 2
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW -- 3
)
FROM mytable
salesman
. So, the following will only done within theses partitionsdate
columnIf you are using Postgres 11 or later, you can define the window even more precisely using RANGE
with date intervals instead of counting ROWS
:
SELECT
*,
SUM(amount) OVER (
PARTITION BY salesman
ORDER BY "date"
RANGE BETWEEN interval '2 days' PRECEDING AND CURRENT ROW
)
FROM mytable
Upvotes: 9