Reputation: 359
I have this table
+------------+---------------------+--+
| country | commodity |
+------------+---------------------+--+
| Argentina | Copper, molybdenum |
| Argentina | Silver, lead |
| Argentina | Copper, gold |
| Argentina | Copper, gold |
| Argentina | Copper |
| Spain | Rhodochrosite |
| Spain | Copper |
| Spain | Limestone |
| Spain | Gold |
| Spain | Limestone |
+------------+---------------------+--+
And I want to display this
+------------+-----------------------------------------+--+
| country | minerals |
+------------+-----------------------------------------+--+
| Argentina | copper, molybdenum, silver, lead, gold |
| Spain | rhodochrosite, copper, limestone, gold |
+------------+-----------------------------------------+--+
So I want do join all commodity for each country in one column "minerals" and eliminate duplicates, but in original column "commodity" there can be more then 1 minerals as you can see in first table and also there can be lower or upper case Gold, gold etc.
I tried
SELECT country, CONCAT_WS(', ' ,COLLECT_SET(LOWER(commodity))) as minerals
FROM depositOPT
GROUP BY country;
But it didn't eliminate duplicates because output looks like this
+------------+------------------------------------------------------------------------
| country | minerals
+------------+------------------------------------------------------------------------
| Argentina | copper, molybdenum, silver, lead, copper, gold, copper, gold, copper
| Spain | rhodochrosite, copper, limestone, gold, limestone
+------------+------------------------------------------------------------------------
Thanks for advice.
Upvotes: 1
Views: 990
Reputation: 12684
I will split up the commodity columns into individual minerals to remove duplicates then group by country. Hope this helps. Thanks.
select t.country,
collect_set(lower(t.minerals)) as minerals
from (select country, trim(minerals) as minerals
from depositOPT lateral view explode(split(commodity,',')) s as minerals) t
group by t.country;
**Sample result:**
Country Minerals
Argentina ["copper","molybdenum","silver","lead","gold"]
Upvotes: 1