Saikat Roy
Saikat Roy

Reputation: 23

How to do a row wise count in Hive

h1      h2      h3      h4      h5      h6      h7      h8
U       U       NULL    U       Y       NULL    Y       X
U       NULL    U       U       Y       Y       X       X
U       U       U       NULL    U       NULL    Y       NULL
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
X       V       U       U       Y       NULL    Z       X
Y       X       NULL    X       Y       Z       U
X       NULL    U       NULL    NULL    U       Z       Y
NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL

Above data set is has 8 columns h1,h2,h3......h8. if all the all the column is NULL then the count is 0. if at least one column has value then the count is 1. like the first row has a count of 6.(do not consider null values).

And here X= boy,v= girl, u= wife, z=husbend, y= head. so, how can we find the count by type(ex. boy or wife or head etc) for every row wise. like how many wife and how many husbend or how many girl are there?

OUTPUT SHOULD BE LIKE

X(BOY)=8 U(WIFE)=15 ETC....

FROM THE ABOVE DATA SET...HAVE TO COUNT TOTAL NUMBER OF MEMBER PRESENT BY TYPE.

Upvotes: 2

Views: 396

Answers (2)

leftjoin
leftjoin

Reputation: 38325

Concatenate string, remove everything except 'U' (for wife), calculate lenth then sum all of them. Please see comments in the code:

select --get total sum()
      sum( Wife_cnt    ) as Wife_cnt,
      sum( Boy_cnt     ) as Boy_cnt,
      sum( Head_cnt    ) as Head_cnt,
      sum( Girl_cnt    ) as Girl_cnt,
      sum( Husband_cnt ) as Husband_cnt

 from( --remove everything except particular character and count them (length)
 select length(regexp_replace(row_concatenated, '[^U]', '')) as Wife_cnt,
        length(regexp_replace(row_concatenated, '[^X]', '')) as Boy_cnt,
        length(regexp_replace(row_concatenated, '[^Y]', '')) as Head_cnt,
        length(regexp_replace(row_concatenated, '[^V]', '')) as Girl_cnt,
        length(regexp_replace(row_concatenated, '[^Z]', '')) as Husband_cnt
from( --concatenate the string
  select concat_ws(',',h1,h2,h3,h4,h5,h6,h7,h8) as row_concatenated
  from table_name
  )s
  )s;

Upvotes: 1

Agung Sriwongo
Agung Sriwongo

Reputation: 133

I think you can use concat_ws to concat all columns, concat_ws of multiple null values will return an empty string. With this you can then get the length of your total columns each row (in your example each cell has only one chracter) using length function, this will return the count of non null columns in each row.

SELECT LENGTH(CONCAT_WS('',h1,h2,h3,h4,h5,h6,h7,h8)) as h9 from your_table;

Upvotes: 0

Related Questions