asfallows
asfallows

Reputation: 6108

Remove unwanted elements from arrays

I have a table scraped_listings and a table scraped_categories. The column scraped_listings.categories is an integer[] array containing ids of rows in scraped_categories.

Somehow (probably through a mistake I don't remember making), some scraped_listings rows have ids in categories that do NOT belong to a category row (those rows have been deleted, I suspect).

I have the following query, which gives me the rows affected:

SELECT * FROM scraped_listings a
JOIN (
  SELECT array_agg(id) AS ids
  FROM scraped_categories
  ) b ON NOT a.categories <@ b.ids;

What I'd like to do now is to remove the ids which are not valid from categories for the rows found by this query - if an item in the array is not a valid scraped_category id, it should be dropped.

How might I do this?

Upvotes: 0

Views: 158

Answers (1)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656714

Postgres 9.2 or earlier:

UPDATE scraped_listings s
SET   categories = up.categories
FROM (
    SELECT a.pkey, array_agg(a.id) AS categories
    FROM  (
        SELECT pkey, unnest (categories) AS id
        FROM   scraped_listings
        ) a
    JOIN scraped_categories s USING (id) -- eliminates unwanted ids
    ) up
WHERE s.pkey = up.pkey

pkeybeing the undisclosed primary key column of scraped_listings.

In Postgres 9.3 you would use LATERAL for the correlated unnest():

UPDATE scraped_listings s
SET   categories = up.categories
FROM (
    SELECT a.pkey, array_agg(a.id) AS categories
    FROM  (
        SELECT pkey, c_id AS id
        FROM   scraped_listings l, unnest(l.categories) c_id  -- implicit LATERAL
        ) a
    JOIN scraped_categories s USING (id) -- eliminates unwanted ids
    ) up
WHERE s.pkey = up.pkey

Or you install the additional module intarray, that ships additional operators for int[], like:

int[] - int     int[]   remove entries matching right argument from array

Upvotes: 1

Related Questions