Reputation: 287
This is my hive table:
course dept subject status
btech cse java pass
btech cse hadoop fail
btech cse cg detained
btech cse cc pass
btech it daa pass
btech it wt pass
btech it cnn pass
mba hr hrlaw pass
mba hr hrguid absent
mtech cs java pass
mtech cs cd pass
mtech cs cp detained
I want to query this table to retrieve data in the following way:
course dept status
btech cse fail
btech it pass
mba hr absent
mtech cs fail
First, it will check for "fail" or "detained" in the status
of each dept
and course
grouped together. If it finds "fail" or "detained", it will output "fail" as the status
. Else, if an "absent" is found in the same group, it will output "absent" as the status
. Else, it will output "pass".
I got an error message when I ran the following query:
select course,dept,
case
when status in ( 'fail','detained') then 'fail'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'pass'
else null
end as Final_Status
from college
group by course,dept;
Upvotes: 10
Views: 93073
Reputation: 22651
If I understood correctly, you want something like:
select course,dept,
case
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null
end as Final_Status
from college
group by course,dept,
CASE when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null END;
I am using CASE in GROUP and it works fine with Hive.
Upvotes: 6
Reputation: 21
Try this.
select course,dept,
collect_set(
case
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null
end ) as Final_Status
from college
group by course,dept;
Upvotes: 2
Reputation: 421
When you group by course and dept, you would get multiple values (comming for different records) for status column, this needs to be handled.
Any column in the select which is not part of group by should be within an aggregate function
here is a solution using sum() function.
select course, dept,
case when sum(case when status in ( 'fail','detained') then 1 else 0 end) > 0 then 'fail'
when sum(case when status in ('absent') then 1 else 0 end) > 0 then 'absent'
when sum(case when status in ('pass') then 1 else 0 end) > 0 then 'pass'
else 'no_result'
end as final_status
from college
group by
course,dept
Upvotes: 18
Reputation: 439
The problem is, The columns that are required for group by need to be in the end. Below the modified query, It should work now.
select
case
when status in ( 'fail','detained') then 'FAILED'
when status in ( 'absent') then 'absent'
when status in ( 'pass') then 'PASSED'
else null
end as Final_Status,course,dept
from college
group by course,dept;
Upvotes: 0