SAB

Reputation: 309

Sum of cumulative difference within group in Postgres

I have a table that looks like the one below. I would like to find, for each month, the sum of the cumulative difference between the target value and the daily actual value.

ID  | Date    | Target_value | Daily_Value
1   |01/10/20 |   200        |   5
2   |01/10/20 |   500        |   2
3   |05/10/20 |   600        |   10
1   |04/11/20 |   200        |   50
2   |05/11/20 |   500        |   80
3   |05/11/20 |   600        |   40
1   |06/12/20 |   200        |   50
4   |06/12/20 |   400        |   30
5   |07/12/20 |   300        |   20
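
To reproduce the example, here is a minimal setup sketch. The table name your_table and the column types are assumptions (the question does not state them), and the dates are read as DD/MM/YY, which matches the months in the expected output.

```sql
-- Hypothetical table name and column types; adjust to your own schema.
CREATE TABLE your_table (
  ID           INT,
  Date         DATE,
  Target_value INT,
  Daily_Value  INT
);

-- Sample rows from the table above, with dates written in ISO format.
INSERT INTO your_table (ID, Date, Target_value, Daily_Value) VALUES
  (1, DATE '2020-10-01', 200,  5),
  (2, DATE '2020-10-01', 500,  2),
  (3, DATE '2020-10-05', 600, 10),
  (1, DATE '2020-11-04', 200, 50),
  (2, DATE '2020-11-05', 500, 80),
  (3, DATE '2020-11-05', 600, 40),
  (1, DATE '2020-12-06', 200, 50),
  (4, DATE '2020-12-06', 400, 30),
  (5, DATE '2020-12-07', 300, 20);
```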

Expected output

Date     | Target_value - monthly_cumulative daily_value           |
   10/20 | (200 + 500 + 600) - (5 + 2 + 10) =   1283               |
   11/20 | (200 + 500 + 600) - (17 + 50 + 80 + 40) = 1113          |
   12/20 | (200 + 500 + 600 + 400 + 300) - (17 + 170 + 100) = 1713 |


This is similar to "Calculating Cumulative Sum in PostgreSQL", but not exactly the same.

Upvotes: 2

Views: 174

Answers (3)

Anirban Jana

Reputation: 33

Unlike Postgres, AWS Redshift requires a frame clause for aggregate window functions when the window uses "over (order by ...)".

The following is the updated version of LukStorms' answer.

SELECT 
  TO_CHAR(DATE_TRUNC('month', date), 'MM/YY') AS MonthYear
, SUM(Target_value) 
  - SUM(SUM(Daily_Value)) OVER (ORDER BY MonthYear ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS monthly_cumulative
FROM yourtable
GROUP BY MonthYear
ORDER BY MonthYear;

The frame clause added here is "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW".

For conciseness, I have also replaced all but the first use of DATE_TRUNC with the alias MonthYear.
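
One thing to watch: MonthYear is text in 'MM/YY' form, so ordering by it sorts on the month number first and would misorder data that spans more than one year (for example, 01/21 would sort before 10/20). Below is a sketch that orders by the truncated date instead and keeps the explicit frame clause. It only covers the cumulative daily part, and the names month_first, daily_sum and cumulative_daily are made up for illustration.

```sql
-- Aggregate per month first, then take the running total ordered by the
-- real (truncated) date rather than by its 'MM/YY' text representation.
SELECT
  TO_CHAR(month_first, 'MM/YY') AS MonthYear
, SUM(daily_sum) OVER (ORDER BY month_first
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_daily
FROM (
  SELECT DATE_TRUNC('month', date) AS month_first  -- hypothetical alias
       , SUM(Daily_Value)          AS daily_sum    -- hypothetical alias
  FROM yourtable
  GROUP BY DATE_TRUNC('month', date)
) m
ORDER BY month_first;
```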

Upvotes: 0

LukStorms

Reputation: 29667

Group by the truncated dates.

Then sum over the daily sums.

But the targets need to be handled separately.

WITH CTE_TARGETS AS (
  SELECT ID
  , MAX(Target_Value) AS Target_Value
  , MIN(DATE_TRUNC('month', Date)) as month_first
  FROM your_table
  GROUP BY ID
), CTE_MONTHLY AS
(
  SELECT 
    DATE_TRUNC('month', Date) AS month_first
  , SUM(SUM(Daily_Value)) OVER (ORDER BY DATE_TRUNC('month', Date)) AS month_daily
  FROM your_table t
  GROUP BY DATE_TRUNC('month', Date)
) 
SELECT 
  TO_CHAR(mon.month_first, 'MM/YY') AS Month
, SUM(Target_Value) - month_daily AS monthly_cummulative
FROM CTE_MONTHLY mon
JOIN CTE_TARGETS tgt ON tgt.month_first <= mon.month_first
GROUP BY mon.month_first, month_daily
ORDER BY mon.month_first

Result:

month | monthly_cummulative
10/20 | 1283
11/20 | 1113
12/20 | 1713
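
To see why the targets are handled separately, it can help to run the first CTE on its own. This sketch simply reuses the body of CTE_TARGETS from the query above as a standalone query (assuming the same your_table). It returns one row per ID with its target and the first month in which that ID appears, which is what the join condition tgt.month_first <= mon.month_first relies on.

```sql
-- One row per ID: its target and the first month it shows up in.
SELECT ID
, MAX(Target_Value) AS Target_Value
, MIN(DATE_TRUNC('month', Date)) AS month_first
FROM your_table
GROUP BY ID
ORDER BY ID;
-- On the question's sample rows:
--   IDs 1, 2, 3 (targets 200, 500, 600) first appear in October 2020
--   IDs 4, 5    (targets 400, 300)      first appear in December 2020
-- so the targets counted per month are 1300, 1300 and 2000.
```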

db<>fiddle here

Upvotes: 0

Tim Biegeleisen

Reputation: 521629

We can do this in two steps. First, aggregate by year and month and generate the sums for the target and daily values. Then, use SUM() as an analytic function with a rolling window across the entire intermediate table to generate the difference.

WITH cte AS (
    SELECT DATE_TRUNC('month', Date) AS ym, SUM(Target_value) AS Target_value,
           SUM(Daily_Value) AS Daily_Value
    FROM yourTable
    GROUP BY 1
)

SELECT ym, Target_value,
       Target_Value - SUM(Daily_Value) OVER (ORDER BY ym) AS output
FROM cte
ORDER BY ym;
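
The first step can also be run on its own to check the intermediate values. This is a sketch of just the aggregation, using the same yourTable placeholder and the ym name that the outer query refers to.

```sql
-- Step 1 only: monthly sums of the target and daily values.
SELECT DATE_TRUNC('month', Date) AS ym,
       SUM(Target_value) AS Target_value,
       SUM(Daily_Value)  AS Daily_Value
FROM yourTable
GROUP BY 1
ORDER BY 1;
-- On the question's sample rows (month shown as YYYY-MM for brevity):
--   2020-10: Target_value 1300, Daily_Value  17
--   2020-11: Target_value 1300, Daily_Value 170
--   2020-12: Target_value  900, Daily_Value 100
```

The running total of Daily_Value is then 17, 187 and 287. Note that the December target sum here is 900 (only IDs 1, 4 and 5 have rows that month), whereas the expected output in the question uses 2000, counting every ID seen so far. If that is the intended behaviour, the targets would need to be accumulated per ID rather than summed per month.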


Demo

Upvotes: 1
