clubkli
clubkli

Reputation: 257

Query to sum over multiple rows based on column

I have a table with monthly amounts per ID, where in some of these months an invoice takes place.

My table looks like:

ID Date Invoiced Amount
AAA 2023-01 0 10
AAA 2023-02 0 15
AAA 2023-03 1 15
AAA 2023-04 0 10
AAA 2023-05 0 10
AAA 2023-06 1 10
BBB 2022-05 0 40
BBB 2022-06 1 20
BBB 2022-07 0 30
BBB 2022-08 1 30

I need to have the rows only per ID with the invoice months, where we sum over the previous amounts since the last invoice. What I want to have:

ID Date Invoiced Amount
AAA 2023-03 1 40
AAA 2023-06 1 30
BBB 2022-06 1 60
BBB 2022-08 1 60

How this can be done in postgresql? I started with the query below, but it gives not yet what I need.

SELECT "ID", "Date", "Invoiced"
      , sum(Amount) OVER (PARTITION BY "Invoiced" ORDER BY "Id", "Date") AS Amount
FROM   Table

Upvotes: 3

Views: 110

Answers (3)

Zegarek
Zegarek

Reputation: 26123

For completeness, here's one with correlated scalar subqueries: demo

select "ID", "Date", "Invoiced",
  (select sum(t2."Amount")+t1."Amount" 
   from my_table t2
   where t2."ID"=t1."ID"
     and t2."Invoiced"=0
     and t2."Date"<=t1."Date"
     and t2."Date">=(select coalesce(max("Date"),'0000-00') 
                     from my_table t3
                     where t3."ID"=t1."ID"
                       and t3."Invoiced"=1 
                       and t3."Date"<t1."Date")
  ) as "Amount"
from my_table as t1
where "Invoiced"=1;
ID Date Invoiced Amount
AAA 2023-03 1 40
AAA 2023-06 1 30
BBB 2022-06 1 60
BBB 2022-08 1 60

To get more columns from the lookback, it would have to be moved down to FROM section as a LATERAL subquery - works the same, but added columns make it no longer scalar, so it can't remain in the SELECT list.

Here's a similar thread where you can find a few performance tests comparing the two, in the comments. Long story short, window functions are simply the right tool for this task - less typing, higher performance. Still, alternatives are always worth a mention.

Upvotes: 0

SelVazi
SelVazi

Reputation: 16043

You have a gaps and islands problem, you could use SUM() as window function in descending order, to give a unique id to each successive group of rows (0 values then 1) :

WITH cte AS (
  SELECT *, sum(invoiced) OVER (PARTITION BY ID ORDER BY Date desc) grp
  FROM mytable
  ORDER BY ID, Date
)
SELECT ID, MAX(date) AS Date, MAX(Invoiced) AS Invoiced, SUM(Amount) AS Amount
FROM cte
GROUP BY ID, grp
ORDER BY ID, Date

Upvotes: 3

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521269

We can use a combination of LAG() and SUM() as window functions, to create a pseudo group for each invoice period, for each ID:

WITH cte1 AS (
    SELECT *, LAG(Invoiced, 1, 0) OVER (PARTITION BY ID ORDER BY Date) LagInvoiced
    FROM yourTable
),
cte2 AS (
    SELECT *, SUM(LagInvoiced) OVER (PARTITION BY ID ORDER BY Date) grp
    FROM cte1
)

SELECT
    ID,
    MAX(CASE WHEN Invoiced = 1 THEN Date END) AS InvDate,
    1 AS Invoiced,
    SUM(Amount)
FROM cte2
GROUP BY
    ID,
    grp
ORDER BY
    ID,
    InvDate;

screen capture from demo link below

Demo

Upvotes: 1

Related Questions