Babu
Babu

Reputation: 169

Multiple counts in hive query not giving expected results

I have data as follows.

enter image description here

I am trying to get the following results with the query what I have developed.

select date,name,act_cd,type_cd, type 
count(distinct CASE WHEN act_cd in (1) THEN key_id ELSE 0 END) as count-of-AB,
count(distinct CASE WHEN act_cd in (2) THEN key_id ELSE 0 END) as count-of-CD
FROM table 
where act_cd in (1,2)
and type in (MR,LP)
group by date,name,act_cd,type_cd,type

Expected Results

enter image description here

But its giving following results. Not sure what is the wrong with the query and could soneone let me know how can we get expected results.

Actual Results

enter image description here

Really appreciate your help.

Thanks,Babu

Upvotes: 0

Views: 365

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1271121

Remove the ELSE:

select date, name, type_cd, type 
       count(distinct CASE WHEN act_cd in (1) THEN key_id END) as count_of_AB,
       count(distinct CASE WHEN act_cd in (2) THEN key_id END) as count_of_CD
from table 
where act_cd in (1, 2) and
      type in ('MR','LP')
group by date, name, type_cd, type;

The issue with your code is that the 0 counts as a value for count(distinct).

In addition, it is not appropriate to have act_cd in the group by.

If key_id is guaranteed to be unique (at least within a group), then don't use distinct. I prefer sum() but you can also use count():

select date, name, type_cd, type 
       count(CASE WHEN act_cd in (1) THEN key_id END) as count_of_AB,
       count(CASE WHEN act_cd in (2) THEN key_id END) as count_of_CD
from table 
where act_cd in (1, 2) and
      type in ('MR','LP')
group by date, name, type_cd, type;

Upvotes: 1

eshirvana
eshirvana

Reputation: 24623

just a minor bug otherwise your query is ok, you need to pass null in else to not count it :

select date,name,act_cd,type_cd, type 
count(distinct CASE WHEN act_cd in (1) THEN 1 ELSE NULL END) as count-of-AB,
count(distinct CASE WHEN act_cd in (2) THEN 1 ELSE NULL END) as count-of-CD
FROM table 
where act_cd in (1,2)
and type in (MR,LP)
group by date,name,act_cd,type_cd,type

Upvotes: 0

Related Questions