Ben
Ben

Reputation: 835

cumulative multiply in postgres

I want to do a cumulative multiplication in Postgres, but cannot find any references to this.

my table has a sequence number and a value field.

| seq | value |
| 0   | 1     |
| 1   | 3     |
| 2   | 5     |
| 3   | 7     |

and i want to generate return a new column which is 'value' for the first row and 'value' multiplied by the previous rows result for all other entries

| seq | value | result |
| 0   | 1     | 1      |
| 1   | 3     | 3      |
| 2   | 5     | 15     |
| 3   | 7     | 105    |

The function i want to create is a bit like a running total, which points me back to the sum() function rather than creating the function myself using multiplication operator.

I will also update the function from the simple val*(prev_result) in future versions of the query so dont want to build a 'function' to do this.

Do i need to look at doing a join of the table back onto itself using a 'recursive with' using a 'common table expression' ?

Will this be the most efficient method or should i be using windowing inside a CTE using a combination of 'lag over' ?

Any help or example fiddle examples would be appreciated.

Upvotes: 2

Views: 3816

Answers (2)

Craig Ringer
Craig Ringer

Reputation: 324731

Any aggregate function can be used as a window function to get a running aggregation. PostgreSQL lets you define custom aggregates from SQL.

So one way is to define a cumulative multiplication aggregate, using the underlying function that implements the multiplication operator for a type. e.g. for int8:

CREATE AGGREGATE cumulative_mul(int8) (SFUNC = int8mul, STYPE = int8);

You can then use this in a window function expression, e.g.

SELECT 
  seq,
  "value",
  cumulative_mul("value") OVER (ORDER BY seq)
FROM t;

See SQLFiddle.

Result:

test=> SELECT 
  seq,
  "value",
  cumulative_mul("value") OVER (ORDER BY seq)
FROM t;
 seq | value | cumulative_mul 
-----+-------+----------------
   0 |     1 |              1
   1 |     3 |              3
   2 |     5 |             15
   3 |     7 |            105
(4 rows)

... and it should be pretty fast too, especially if there's an index on (seq, value), allowing PostgreSQL to do an index-only scan:

test=> explain SELECT 
  seq,
  "value",
  cumulative_mul("value") OVER (ORDER BY seq)
FROM t;
                                QUERY PLAN                                 
---------------------------------------------------------------------------
 WindowAgg  (cost=0.13..8.26 rows=4 width=8)
   ->  Index Only Scan using seqval on t  (cost=0.13..8.19 rows=4 width=8)
(2 rows)

Upvotes: 5

jpw
jpw

Reputation: 44891

Using a recursive common table expression should work, but it might not scale well for larger data sets:

with recursive c(seq, value) as (
    select seq, value from t
    union all
    select t.seq, c.value * t.value
    from t join c on t.seq = c.seq + 1
)

select seq, max(value) as value 
from c
group by seq
order by seq;

Sample SQL Fiddle

Upvotes: 1

Related Questions