raghuram gururajan
raghuram gururajan

Reputation: 563

How to use aggregate functions in Hive on Group by columns

When I try to use a inbuilt UDF function or my own UDF function on the GroupBy columns as below in hive I seem to be getting error

select col1, col2 from xyz group by my_func(col1), col2

It keeps complaining column –col1 not found in group by expression.

Upvotes: 1

Views: 4548

Answers (3)

aiman
aiman

Reputation: 1103

select col1, col2 from xyz group by my_func(col1) as col1, col2 

The basic is that your GROUP BY needs to have all the cols that you have mentioned in SELECT clause.

Upvotes: 0

Joe K
Joe K

Reputation: 18424

When you apply a function to a column, it is not longer called the same thing. You should name it explicitly using the as keyword.

select group1, group2 from xyz group by my_func(col1) as group1, col2 as group2;

Also, if you're only selecting the columns that you're grouping by, not the actual grouped data, maybe distinct would be more appropriate than group by?

Upvotes: 3

Greg Ross
Greg Ross

Reputation: 3498

The call to the aggregate function is in the wrong place. It should be made as follows:

Select my_func(col1),col2 from xyz group by col1,col2

Upvotes: 1

Related Questions