Reputation: 1038
Given the following data: (id: int, group: string)
+---------+-----------+
| id | group |
+---------+-----------+
| 1 | 'a,b,c' |
| 2 | 'a' |
| 3 | 'a,c' |
+---------+-----------+
I want to get a participant count of each group. The result should be:
+---------+--------+
| group | count |
+---------+--------+
| a | 3 |
| b | 1 |
| c | 2 |
+---------+--------+
How can I achieve it? How the query should look like? I have tried with split() function but without any results :/
Upvotes: 0
Views: 28
Reputation: 13387
This should do the trick:
select
main.group,
count(1)
from
(
select
explode(split(group, ",")) as group
from
main
) main
group by
main.group
If your query is more complex than the sample you provided, you might also like to read about lateral view
.
Ref. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
Upvotes: 1