Reputation: 37
In PostgreSQL I have a table (Table
) that contains an id column (ID
) and another column (Values
) that contains an array of strings. I have a select query (SelectQuery
) that gets me an ID
that matches Table.ID
, as well as an array of values (RemoveValues
). I would like to now remove from the Values
array, any strings that are contained in the RemoveValues
array that match on ID
, updating Table.
--Table--
ID Values
-- ----------------------------
1 {string1, string2, string3}
2 {string1, string2, string3}
...
--SelectQuery--
ID RemoveValues
-- ----------------------------
1 {string2}
2 {string1, string2}
...
Execute query that I can't figure out
--Table--
ID Values
-- ----------------------------
1 {string1, string3}
2 {string3}
...
I can't seem to figure out a good way to do this, I've played with a number of joins and the sql array methods and haven't found anything that works. Is there's a way to loop the remove()
method for each element in RemoveValues
? I've also tried something like this with no luck:
select array_agg(elem)
from Table, unnest(Table.Values) elem
where elem <> all(
SelectQuery )
);
I'm feeling like my remaining option is a bash script, unless anyone can suggest a path forward using SQL?
Upvotes: 2
Views: 3337
Reputation: 656481
There are many ways.
First, you have to define whether there can be duplicates or NULL
values in either of the arrays - and how to deal with those if possible. And whether to preserve original order of elements.
Assuming either is possible, and these are the rules:
Count duplicates separately. So if Values
has five elements 'foo' and RemoveValues
has three elements 'foo', thee will be two element 'foo' in the result.
Treat NULL
values equal. So NULL
can be removed with NULL
(though in other contexts NULL = NULL
yields NULL
).
Order of array elements does not have to be preserved.
<SelectQuery>
produces unique ID
values.
That matches the behavior of standard SQL EXCEPT ALL
. See:
So:
SELECT t.*, sub.result_array
FROM tbl t
JOIN (<SelectQuery>) s USING ("ID")
CROSS JOIN LATERAL (
SELECT ARRAY (
SELECT unnest(t."Values")
EXCEPT ALL
SELECT unnest(s."RemoveValues")
)
) sub(result_array);
About the LATERAL
subquery:
Can be plain CROSS JOIN
because an ARRAY constructor always produces a row.
You could wrap the functionality in a custom function:
CREATE FUNCTION f_arr_except_arr(a1 text[], a2 text[])
RETURNS text[]
LANGUAGE SQL IMMUTABLE PARALLEL SAFE
BEGIN ATOMIC
SELECT ARRAY (SELECT unnest(a1) EXCEPT ALL SELECT unnest(a2));
END;
Requires Postgres 14. See:
For older (or any) Postgres versions, and for any array type:
CREATE OR REPLACE FUNCTION f_arr_except_arr(a1 anyarray, a2 anyarray)
RETURNS anyarray
LANGUAGE SQL IMMUTABLE PARALLEL SAFE AS
$func$
SELECT ARRAY (SELECT unnest(a1) EXCEPT ALL SELECT unnest(a2));
$func$;
db<>fiddle here
(PARALLEL SAFE
only for Postgres 9.6 or later, though.)
Then your query can be something like:
SELECT *, f_arr_except_arr(t."Values", s."RemoveValues") AS result_values
FROM tbl t
JOIN (<SelectQuery>) s USING ("ID");
It's unclear whether you actually locked in CaMeL-case spelling with double-quotes. I assumed as much, but better you don't. See:
And your UPDATE
can be:
UPDATE tbl t
SET "Values" = f_arr_except_arr(t."Values", s."RemoveValues")
FROM <SelectQuery>) s
WHERE s."ID" = t."ID"
AND "Values" IS DISTINCT FROM f_arr_except_arr(t."Values", s."RemoveValues");
The additional WHERE
clause is optional to avoid empty updates. See:
Upvotes: 1