Reputation: 2270
I have a table with the an array column type:
title tags
"ridealong";"{comedy,other}"
"ridealong";"{comedy,tragedy}"
"freddyjason";"{horror,silliness}"
I would like to write a query that produces a single array per title(in an ideal world it would be a set/deduplicated array)
e.g.
select array_cat(tags),title from my_test group by title
The above query doesn't work of course, but I would like to produce 2 rows:
"ridealong";"{comedy,other,tragedy}"
"freddyjason";"{horror,silliness}"
Any help or pointers would be very much appreciated (I am using Postgres 9.1)
Based on Craig's help I ended up with the following (slightly altered syntax since 9.1 complains about the query exactly as he shows it)
SELECT t1.title, array_agg(DISTINCT tag.tag)
FROM my_test t1, (select unnest(tags) as tag,title from my_test) as tag
where tag.title=t1.title
GROUP BY t1.title;
Upvotes: 42
Views: 40122
Reputation: 1140
The above solutions have the disadvantage that they do not remove the duplicate values.
So, we end up having duplicate values.
This solution keeps only unique values.
Following the approach of custom aggregate:
CREATE OR REPLACE FUNCTION array_uniq_cat_agg(anyarray, anyarray)
RETURNS anyarray LANGUAGE SQL AS $$
SELECT ARRAY(
SELECT DISTINCT unnest($1 || $2)
)
$$;
CREATE AGGREGATE array_uniq_cat_agg(anyarray) (
SFUNC = array_uniq_cat_agg,
STYPE = anyarray,
INITCOND = '{}'
);
SELECT
title,
array_uniq_cat_agg(tags) AS unique_tags
FROM
my_test
GROUP BY
title;
Upvotes: 0
Reputation: 36214
The obvious solution would be the LATERAL
join (which also suggested by @CraigRinger), but that is added to PostgreSQL in 9.3.
In 9.1 you cannot avoid the sub-query, but you can simplify it:
SELECT title, array_agg(DISTINCT tag)
FROM (SELECT title, unnest(tags) FROM my_test) AS t(title, tag)
GROUP BY title;
Upvotes: 6
Reputation: 324265
Approach 1: define a custom aggregate. Here's one I wrote earlier.
CREATE TABLE my_test(title text, tags text[]);
INSERT INTO my_test(title, tags) VALUES
('ridealong', '{comedy,other}'),
('ridealong', '{comedy,tragedy}'),
('freddyjason', '{horror,silliness}');
CREATE AGGREGATE array_cat_agg(anyarray) (
SFUNC=array_cat,
STYPE=anyarray
);
select title, array_cat_agg(tags) from my_test group by title;
... or since you don't want to preserve order and want to deduplicate, you could use a LATERAL
query like:
SELECT title, array_agg(DISTINCT tag ORDER BY tag)
FROM my_test, unnest(tags) tag
GROUP BY title;
in which case you don't need the custom aggregate. This one is probably a fair bit slower for big data sets due to the deduplication. Removing the ORDER BY
if not required may help, though.
Upvotes: 41