Reputation: 1129
I have the following nested data in two columns:
Categories_A Categories_B
{"A"} {"B","F","C"}
{"B","A"} {"Z","B","F"}
{"B","F"} {"A","E","R"}
I would like to return in a new column the missing category/categories in categories_B when compared to categories_A - what is in Categories_A that is not in categories_B. So ideally:
Categories_A Categories_B Missing_Category
{"A"} {"B","F","C"} {"A"}
{"B","A"} {"Z","B","F"} {"A"}
{"B","F"} {"A","E","R"} {"B","F"}
So far what I managed is to return if there is a complete match between the columns:
select Categories_A, Categories_B,
case
when Categories_A = Categories_B then 'TRUE'
else 'FALSE'
end is_a_match
from facts_themes
Categories_A Categories_B is_a_match
{"A"} {"B","F","C"} FALSE
{"B","A"} {"Z","B","F"} FALSE
{"B","F"} {"A","E","R"} FALSE
Upvotes: 0
Views: 328
Reputation: 656872
While no duplicates can be involved, use the faster EXCEPT ALL
:
SELECT *, ARRAY(SELECT unnest(categories_a) EXCEPT ALL
SELECT unnest(categories_b)) AS missing
FROM facts_themes;
db<>fiddle here
If dupes can be involved, you'll first have to define desired behavior.
About EXCEPT ALL
:
Also note that EXCEPT
treats NULL as just another value (while most array operators do not).
Upvotes: 0
Reputation:
There is no built-in function for that, but it's easy to write your own:
create function array_except(p_one anyarray, p_two anyarray)
returns anyarray
as
$$
select array_agg(e)
from (
select e
from unnest(p_one) as p1(e)
except
select e
from unnest(p_two) as p2(e)
) x
$$
language sql
immutable
;
Then you can use it like this:
select categories_a, categories_b,
array_except(categories_a, categories_b) as missing_categories
from facts_themes
Upvotes: 1