Kate Tres
Kate Tres

Reputation: 93

Union of arrays as aggregate function

I have the following input:

name  | count | options
-----------------------
user1 | 3     | ['option1', 'option2']
user1 | 12    | ['option2', 'option3']
user2 | 2     | ['option1', 'option3']
user2 | 1     | []

I want the following output:

name  | count | options
-----------------------
user1 | 12    | ['option1', 'option2', 'option3']
user2 | 2     | ['option1', 'option3']

I am grouping by name. For each group, the count should be aggregated as the max and the options should be aggregated as the union. I am having troubles figuring out how do the the latter.

Currently, I have this query:

with data(name, count, options) as (
    select 'user1', 12, array['option1', 'option2']::text[]
    union all
    select 'user1', 12, array['option2', 'option3']::text[]
    union all
    select 'user2', 2, array['option1', 'option3']::text[]
    union all
    select 'user2', 1, array[]::text[]
)
select name, max(count)
from data
group by name

http://rextester.com/YTZ45626

I know this can be easily done by defining a custom aggregate function, but I want to do this via a query. I understand the basics of unnest() the array (and array_agg() the results later on), but cannot figure out how to inject this in my query.

Upvotes: 9

Views: 4418

Answers (2)

Jan Katins
Jan Katins

Reputation: 2319

If you cannot do unnest (e.g. because you need to sum up other columns), this is a lightly tested aggregation function which concats each individual array:

CREATE OR REPLACE FUNCTION concat_array_agg_sfunc(state ANYARRAY, v ANYARRAY)
  RETURNS ANYARRAY AS
$$
SELECT array_cat(state, v);
$$ LANGUAGE SQL IMMUTABLE;

DROP AGGREGATE IF EXISTS concat_array_agg( ANYARRAY );

CREATE AGGREGATE concat_array_agg( ANYARRAY ) (
  SFUNC = concat_array_agg_sfunc,
  STYPE = ANYARRAY
  );
SELECT
  k,
  concat_array_agg(v)
FROM (VALUES ('a', ARRAY [1,2,3]),
             ('a', ARRAY [4,5,6]),
             ('b', ARRAY [1,2,3]),
             ('b', ARRAY [2,3,4]),
             ('c', ARRAY [2,3,4]),
             ('c', NULL)) t(k, v)
GROUP BY k

Upvotes: 0

Marth
Marth

Reputation: 24832

You can use an implicit lateral join using unnest(options) in the FROM list, and then using array_agg(distinct v) to create an array with the options:

with data(name, count, options) as (
    select 'user1', 12, array['option1', 'option2']::text[]
    union all
    select 'user1', 12, array['option2', 'option3']::text[]
    union all
    select 'user2', 2, array['option1', 'option3']::text[]
    union all
    select 'user2', 1, array[]::text[]
)
select name, array_agg(distinct v)  -- the 'v' here refers to the 'f(v)' alias below
from data, unnest(options) f(v)
group by name;
┌───────┬───────────────────────────┐
│ name  │         array_agg         │
├───────┼───────────────────────────┤
│ user1 │ {option1,option2,option3} │
│ user2 │ {option1,option3}         │
└───────┴───────────────────────────┘
(2 rows)

Upvotes: 4

Related Questions