priyanka
priyanka

Reputation: 287

how to write case and group by in hive query

This is my hive table:

course   dept    subject   status

btech     cse     java     pass
btech     cse     hadoop   fail
btech     cse     cg       detained
btech     cse     cc       pass
btech      it     daa      pass
btech      it     wt       pass
btech      it     cnn      pass
mba        hr     hrlaw    pass
mba        hr     hrguid   absent
mtech      cs     java     pass
mtech      cs     cd       pass
mtech      cs     cp       detained

I want to query this table to retrieve data in the following way:

course   dept    status

btech     cse     fail
btech      it     pass
mba        hr     absent
mtech      cs     fail

First, it will check for "fail" or "detained" in the status of each dept and course grouped together. If it finds "fail" or "detained", it will output "fail" as the status. Else, if an "absent" is found in the same group, it will output "absent" as the status. Else, it will output "pass".

I got an error message when I ran the following query:

select course,dept,
case 
when status in ( 'fail','detained') then 'fail'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'pass'
else null 
end as Final_Status
from college
group by course,dept;

Upvotes: 10

Views: 93073

Answers (4)

Thomas Decaux
Thomas Decaux

Reputation: 22651

If I understood correctly, you want something like:

select course,dept,
case 
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null 
end as Final_Status
from college
group by course,dept, 
   CASE when status in ( 'fail','detained') then 'FAILED'
   when status in ( 'absent') then 'absent'
   when status in ( 'pass') then 'PASSED'
   else null END;

I am using CASE in GROUP and it works fine with Hive.

Upvotes: 6

Lokesh
Lokesh

Reputation: 21

Try this.

select course,dept,
collect_set(
case 
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null 
end ) as Final_Status
from college
group by course,dept;

Upvotes: 2

b1n0ys
b1n0ys

Reputation: 421

When you group by course and dept, you would get multiple values (comming for different records) for status column, this needs to be handled.
Any column in the select which is not part of group by should be within an aggregate function
here is a solution using sum() function.

select course, dept,
    case when sum(case when status in ( 'fail','detained') then 1 else 0 end) > 0 then 'fail'
         when sum(case when status in ('absent') then 1 else 0 end) > 0 then 'absent'
         when sum(case when status in ('pass') then 1 else 0 end) > 0 then 'pass'
         else 'no_result'
    end as final_status
from college
group by 
    course,dept

Upvotes: 18

BalaramRaju
BalaramRaju

Reputation: 439

The problem is, The columns that are required for group by need to be in the end. Below the modified query, It should work now.

select 
case 
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null 
end as Final_Status,course,dept
from college
group by course,dept;

Upvotes: 0

Related Questions