Sano babu

Reputation: 105

Counting columns with a where clause

Is there a way in Hive to count, for each row, the number of columns that hold a particular value? My data looks like the 'Input' table, and I want to count how many columns have the value 'a' and how many have the value 'b', producing a result like the 'Output' table. Can this be done with a Hive query?

Upvotes: 0

Views: 123

Answers (2)

Vamsi Prabhala

Reputation: 49270

Use lateral view with explode to turn the columns into one row per value, then aggregate per id.

select id
,sum(cast(col='a' as int)) as cnt_a
,sum(cast(col='b' as int)) as cnt_b
,sum(cast(col in ('a','b') as int)) as cnt_total
from tbl
lateral view explode(array(ci_1,ci_2,ci_3,ci_4,ci_5)) x as col
group by id

Upvotes: 2

Gordon Linoff

Reputation: 1271151

One method in Hive is:

select ( (case when cl_1 = 'a' then 1 else 0 end) +
         (case when cl_2 = 'a' then 1 else 0 end) +
         (case when cl_3 = 'a' then 1 else 0 end) +
         (case when cl_4 = 'a' then 1 else 0 end) +
         (case when cl_5 = 'a' then 1 else 0 end)
       ) as count_a,
       ( (case when cl_1 = 'b' then 1 else 0 end) +
         (case when cl_2 = 'b' then 1 else 0 end) +
         (case when cl_3 = 'b' then 1 else 0 end) +
         (case when cl_4 = 'b' then 1 else 0 end) +
         (case when cl_5 = 'b' then 1 else 0 end)
       ) as count_b
from t;

To get the total count, I would suggest using a subquery and adding count_a and count_b.
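
A minimal sketch of that subquery, assuming the same table t and columns cl_1 through cl_5 as in the query above. The alias s and the output name count_total are only illustrative:

```sql
-- Sketch only: the inner query computes the per-row counts,
-- the outer query adds them. Alias s and count_total are hypothetical names.
select s.count_a,
       s.count_b,
       s.count_a + s.count_b as count_total
from (select ( (case when cl_1 = 'a' then 1 else 0 end) +
               (case when cl_2 = 'a' then 1 else 0 end) +
               (case when cl_3 = 'a' then 1 else 0 end) +
               (case when cl_4 = 'a' then 1 else 0 end) +
               (case when cl_5 = 'a' then 1 else 0 end)
             ) as count_a,
             ( (case when cl_1 = 'b' then 1 else 0 end) +
               (case when cl_2 = 'b' then 1 else 0 end) +
               (case when cl_3 = 'b' then 1 else 0 end) +
               (case when cl_4 = 'b' then 1 else 0 end) +
               (case when cl_5 = 'b' then 1 else 0 end)
             ) as count_b
      from t
     ) s;
```

The subquery avoids repeating all ten case expressions just to compute the sum. A column that is NULL falls through to the else branch and counts as 0.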

Upvotes: 2
