how to apply max clause on column other than group by columns in Hive

Question

I have a hive table that contains data like below.

Table
---------------------
c1     c2      c3
a      1       7
a      2       6
a      3       3
a      3       1
a      3       2

I want to write a query to get value 2 from c3 column. The logic is, for column c1 select max(c2) and then within that max(c2) find max(c3)

I wrote query like

select c1, max(c3) from table1 
group by c1
having c2=max(c2)

but this did not work as Hive says that I can use only those columns in having clause that are part of group by.

Please help me with this.

Note:- I need a single query for this. I am able to write the same in two queries

leftjoin · Accepted Answer

with your_data as (
select stack (5,
'a',1,7,
'a',2,6,
'a',3,3,
'a',3,1,
'a',3,2) as (c1,c2,c3)
)

select c1, max(c3) as max_c3
from
(
select c1,c2,c3,
       rank() over(partition by c1 order by c2 desc) rn --max(c2) marked 1
  from your_data
)s where rn=1 --filter records with max(c2)
group by c1

Result:

c1  max_c3  
a   3

how to apply max clause on column other than group by columns in Hive

Answers (2)

Related Questions