Holmes

Reputation: 1079

GROUP BY statement HiveQL

I'm new to Hive. Why do we need to use collect_set(col) when performing a GROUP BY?

select singer, collect_set(song) from songlist GROUP BY singer;

I would really appreciate any help. Thanks in advance!

Upvotes: 0

Views: 104

Answers (1)

Partha Kaushik

Reputation: 690

Dude!! It is the other way around :)

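collect_set(col) is an aggregate function, and any column you select next to an aggregate without aggregating it has to appear in the GROUP BY. In your query you select singer alongside collect_set(song), so it is the collect_set that requires the GROUP BY singer, not the GROUP BY that requires collect_set.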

So in your case you are grouping all the songs sung by each singer. Hence the GROUP BY singer for the collect_set(song).
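To make that concrete, here is a minimal sketch. The table definition and the rows are made up for illustration (only the songlist, singer and song names come from your query), and the INSERT ... VALUES form assumes Hive 0.14 or later.

```sql
-- Hypothetical sample data; the rows below are invented for this example.
CREATE TABLE songlist (singer STRING, song STRING);

INSERT INTO TABLE songlist VALUES
  ('singer_a', 'song_1'),
  ('singer_a', 'song_2'),
  ('singer_a', 'song_1'),   -- duplicate row on purpose
  ('singer_b', 'song_3');

-- One output row per singer. collect_set folds that singer's songs into
-- an array with duplicates removed (element order is not guaranteed).
SELECT singer, collect_set(song) AS songs
FROM songlist
GROUP BY singer;

-- No plain column in the SELECT list, so no GROUP BY is needed:
-- the whole table is treated as a single group.
SELECT collect_set(song) AS all_songs
FROM songlist;
```

If you drop the GROUP BY from the first SELECT but keep singer in the select list, Hive should reject the query with an "Expression not in GROUP BY key" style error. If you want to keep duplicates, collect_list(song) is the counterpart that does not deduplicate.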

Upvotes: 1
