Reputation: 303
Suppose we have (in PostgreSQL 9.1) a table with some identifier, a column of type integer[] and some other columns (at least one, although there might be more) of type integer (or any other which can be summed).
The goal is to have an aggregate giving for each identifier sum of the "summable" column and an array of all distinct elements of the array column.
The only way I can find is to use unnest function on the array column in a subquery and than join it with another subquery aggregating the "summable" columns.
A simple example is as follows:
CREATE TEMP TABLE a (id integer, aint integer[], summable_val integer);
INSERT INTO a VALUES
(1, array[1,2,3], 5),
(2, array[2,3,4], 6),
(3, array[3,4,5], 2),
(1, array[7,8,9], 19);
WITH u AS (
SELECT id, unnest(aint) as t FROM a GROUP BY 1,2
),
d AS (
SELECT id, array_agg(distinct t) ar FROM u GROUP BY 1),
v as (
SELECT id, sum(summable_val) AS val
FROM a GROUP BY 1
)
SELECT v.id, v.val, d.ar
FROM v
JOIN d
ON v.id = d.id;
The code above does what I intended but the question is can we do any better? Main drawback of this solution is that it reads and aggregate table twice which might be troublesome for larger tables.
Some other solution to the general problem is to avoid using the array column and agregate "summable" column for each array member and then use array_agg
in aggregation - but at least for now I'd like to stick to this array way.
Thanks in advance for any ideas.
Upvotes: 8
Views: 17726
Reputation: 121604
The query may be a little bit faster (I suppose) but I cannot see any remarkable optimizations:
select a.id, sum(summable_val) val, ar
from
(select id, array_agg(distinct t) ar
from
(select id, unnest(aint) as t from a group by 1,2) u
group by 1) x
join a on x.id = a.id
group by 1,3
Upvotes: 6