Katrine Bers
Katrine Bers

Reputation: 125

Array intersection as aggregate function for group by

I have the following table:

CREATE TABLE person
AS
  SELECT name, preferences
  FROM ( VALUES
    ( 'John', ARRAY['pizza', 'meat'] ),
    ( 'John', ARRAY['pizza', 'spaghetti'] ),
    ( 'Bill', ARRAY['lettuce', 'pizza'] ),
    ( 'Bill', ARRAY['tomatoes'] )
  ) AS t(name, preferences);

I want to group by person with intersect(preferences) as aggregate function. So I want the following output:

person | preferences
-------------------------------
John   | ['pizza']
Bill   | []

How should this be done in SQL? I guess I need to do something like the following, but what does the X function look like?

SELECT    person.name, array_agg(X)
FROM      person
LEFT JOIN unnest(preferences) preferences
ON        true
GROUP BY  name

Upvotes: 5

Views: 2126

Answers (3)

Evan Carroll
Evan Carroll

Reputation: 1

Using FILTER with ARRAY_AGG

SELECT name, array_agg(pref) FILTER (WHERE namepref = total)
FROM (
  SELECT name, pref, t1.count AS total, count(*) AS namepref
  FROM (
    SELECT name, preferences, count(*) OVER (PARTITION BY name)
    FROM person
  ) AS t1
  CROSS JOIN LATERAL unnest(preferences) AS pref
  GROUP BY name, total, pref
) AS t2
GROUP BY name;

Here is one way to do it using the ARRAY constructor and DISTINCT.

WITH t AS (
  SELECT name, pref, t1.count AS total, count(*) AS namepref
  FROM (
    SELECT name, preferences, count(*) OVER (PARTITION BY name)
    FROM person
  ) AS t1
  CROSS JOIN LATERAL unnest(preferences) AS pref
  GROUP BY name, total, pref
)
SELECT DISTINCT
  name,
  ARRAY(SELECT pref FROM t AS t2 WHERE total=namepref AND t.name = t2.name)
FROM t;

Upvotes: 2

pozs
pozs

Reputation: 36244

If writing a custom aggregate (like @LaurenzAlbe provided) is not an option for you, you can usually enroll the same logic in a recursive CTE:

with recursive cte(name, pref_intersect, pref_prev, iteration) as (
    select   name,
             min(preferences),
             min(preferences),
             0
    from     your_table
    group by name
  union all
    select   name,
             array(select e from unnest(pref_intersect) e
                   intersect
                   select e from unnest(pref_next) e),
             pref_next,
             iteration + 1
    from     cte,
    lateral  (select   your_table.preferences pref_next
              from     your_table
              where    your_table.name        = cte.name
              and      your_table.preferences > cte.pref_prev
              order by your_table.preferences
              limit    1) n
)
select   distinct on (name) name, pref_intersect
from     cte
order by name, iteration desc

http://rextester.com/ZQMGW66052

The main idea here is to find an ordering in which you can "walk" through your rows. I used the natural ordering of the preferences array (because not much of your columns are being showed). Ideally, this ordering should happen on (a) unique field(s) (preferably on the primary key), but here, because duplications in the preferences column does not influence the result of the intersection, it is good enough.

Upvotes: 1

Laurenz Albe
Laurenz Albe

Reputation: 246598

You could create your own aggregate function:

CREATE OR REPLACE FUNCTION arr_sec_agg_f(anyarray, anyarray) RETURNS anyarray
   LANGUAGE sql IMMUTABLE AS
   'SELECT CASE
              WHEN $1 IS NULL
              THEN $2
              WHEN $2 IS NULL
              THEN $1
              ELSE array_agg(x)
           END
    FROM (SELECT x FROM unnest($1) a(x)
          INTERSECT
          SELECT x FROM unnest($2) a(x)
         ) q';

CREATE AGGREGATE arr_sec_agg(anyarray) (
   SFUNC = arr_sec_agg_f(anyarray, anyarray),
   STYPE = anyarray
);

SELECT name, arr_sec_agg(preferences)
FROM person
GROUP BY name;

┌──────┬─────────────┐
│ name │ arr_sec_agg │
├──────┼─────────────┤
│ John │ {pizza}     │
│ Bill │             │
└──────┴─────────────┘
(2 rows)

Upvotes: 5

Related Questions