Reputation: 257

Query to sum over multiple rows based on column

I have a table with monthly amounts per ID, where in some of these months an invoice takes place.

My table looks like:

ID	Date	Invoiced	Amount
AAA	2023-01	0	10
AAA	2023-02	0	15
AAA	2023-03	1	15
AAA	2023-04	0	10
AAA	2023-05	0	10
AAA	2023-06	1	10
BBB	2022-05	0	40
BBB	2022-06	1	20
BBB	2022-07	0	30
BBB	2022-08	1	30

I need to have the rows only per ID with the invoice months, where we sum over the previous amounts since the last invoice. What I want to have:

ID	Date	Invoiced	Amount
AAA	2023-03	1	40
AAA	2023-06	1	30
BBB	2022-06	1	60
BBB	2022-08	1	60

How this can be done in postgresql? I started with the query below, but it gives not yet what I need.

SELECT "ID", "Date", "Invoiced"
      , sum(Amount) OVER (PARTITION BY "Invoiced" ORDER BY "Id", "Date") AS Amount
FROM   Table

Upvotes: 3

Answers (3)

Zegarek

Reputation: 26123

For completeness, here's one with correlated scalar subqueries: demo

select "ID", "Date", "Invoiced",
  (select sum(t2."Amount")+t1."Amount" 
   from my_table t2
   where t2."ID"=t1."ID"
     and t2."Invoiced"=0
     and t2."Date"<=t1."Date"
     and t2."Date">=(select coalesce(max("Date"),'0000-00') 
                     from my_table t3
                     where t3."ID"=t1."ID"
                       and t3."Invoiced"=1 
                       and t3."Date"<t1."Date")
  ) as "Amount"
from my_table as t1
where "Invoiced"=1;

ID	Date	Invoiced	Amount
AAA	2023-03	1	40
AAA	2023-06	1	30
BBB	2022-06	1	60
BBB	2022-08	1	60

To get more columns from the lookback, it would have to be moved down to FROM section as a LATERAL subquery - works the same, but added columns make it no longer scalar, so it can't remain in the SELECT list.

Here's a similar thread where you can find a few performance tests comparing the two, in the comments. Long story short, window functions are simply the right tool for this task - less typing, higher performance. Still, alternatives are always worth a mention.

Upvotes: 0

SelVazi

Reputation: 16043

You have a gaps and islands problem, you could use SUM() as window function in descending order, to give a unique id to each successive group of rows (0 values then 1) :

WITH cte AS (
  SELECT *, sum(invoiced) OVER (PARTITION BY ID ORDER BY Date desc) grp
  FROM mytable
  ORDER BY ID, Date
)
SELECT ID, MAX(date) AS Date, MAX(Invoiced) AS Invoiced, SUM(Amount) AS Amount
FROM cte
GROUP BY ID, grp
ORDER BY ID, Date

Upvotes: 3

Tim Biegeleisen

Reputation: 521269

We can use a combination of LAG() and SUM() as window functions, to create a pseudo group for each invoice period, for each ID:

WITH cte1 AS (
    SELECT *, LAG(Invoiced, 1, 0) OVER (PARTITION BY ID ORDER BY Date) LagInvoiced
    FROM yourTable
),
cte2 AS (
    SELECT *, SUM(LagInvoiced) OVER (PARTITION BY ID ORDER BY Date) grp
    FROM cte1
)

SELECT
    ID,
    MAX(CASE WHEN Invoiced = 1 THEN Date END) AS InvDate,
    1 AS Invoiced,
    SUM(Amount)
FROM cte2
GROUP BY
    ID,
    grp
ORDER BY
    ID,
    InvDate;

Demo

Upvotes: 1

Query to sum over multiple rows based on column

Answers (3)

Demo

Related Questions