Reputation: 2390
What is the best way to perform a cumulative sum over a table in Postgres, in a way that can bring the best performance and flexibility in case more fields / columns are added to the table.
Table
a b d 1 59 15 181 2 16 268 3 219 4 102
Cumulative
a b d 1 59 15 181 2 31 449 3 668 4 770
Upvotes: 2
Views: 1413
Reputation: 656231
I think what you are really looking for is this:
SELECT id
, sum(a) OVER (PARTITION BY a_grp ORDER BY id) as a
, sum(b) OVER (PARTITION BY b_grp ORDER BY id) as b
, sum(d) OVER (PARTITION BY d_grp ORDER BY id) as d
FROM (
SELECT *
, count(a IS NULL OR NULL) OVER (ORDER BY id) as a_grp
, count(b IS NULL OR NULL) OVER (ORDER BY id) as b_grp
, count(d IS NULL OR NULL) OVER (ORDER BY id) as d_grp
FROM tbl
) sub
ORDER BY id;
The expression count(col IS NULL OR NULL) OVER (ORDER BY id)
forms groups of consecutive non-null rows for a
, b
and d
in the subquery sub
.
In the outer query we run cumulative sums per group. NULL
values form their own group and stay NULL
automatically. No additional CASE
statement necessary.
SQL Fiddle (with some added values for column a
to demonstrate the effect).
Upvotes: 3
Reputation: 42095
Window functions for running sum.
SELECT sum(a) OVER (ORDER BY d) as "a",
sum(b) OVER (ORDER BY d) as "b",
sum(d) OVER (ORDER BY d) as "d"
FROM table;
If you have more than one running sum, make sure the orders are the same.
It's important to note that if you want your columns to appear as the aggregate table in your question (each field uniquely ordered), it'd be a little more involved.
Update: I've modified the query to do the required sorting, without a given common field.
WITH
rcd AS (
select row_number() OVER() as num,a,b,d
from tbl
),
sorted_a AS (
select row_number() OVER(w1) as num, sum(a) over(w2) a
from tbl
window w1 as (order by a nulls last),
w2 as (order by a nulls first)
),
sorted_b AS (
select row_number() OVER(w1) as num, sum(b) over(w2) b
from tbl
window w1 as (order by b nulls last),
w2 as (order by b nulls first)
),
sorted_d AS (
select row_number() OVER(w1) as num, sum(d) over(w2) d
from tbl
window w1 as (order by d nulls last),
w2 as (order by d nulls first)
)
SELECT sorted_a.a, sorted_b.b, sorted_d.d
FROM rcd
JOIN sorted_a USING(num)
JOIN sorted_b USING(num)
JOIN sorted_d USING(num)
ORDER BY num;
Upvotes: 3
Reputation: 1269445
You can use window functions, but you need additional logic to avoid values where there are NULL
s:
SELECT id,
(case when a is not null then sum(a) OVER (ORDER BY id) end) as a,
(case when b is not null then sum(b) OVER (ORDER BY id) end) as b,
(case when d is not null then sum(d) OVER (ORDER BY id) end) as d
FROM table;
This assumes that the first column that specifies the ordering is called id
.
Upvotes: 3