zzzbbx
zzzbbx

Reputation: 10131

Find columns with non zero values in Hive

Let's say you have a database with n columns, from col1 to coln, where n is large.

Would it be possible to find all rows such that at least one column from colk to coln has a non zero value (assuming columns are non-negative numbers, and numbers may be missing)?

Upvotes: 0

Views: 2120

Answers (2)

Prashanthhh Kumarr
Prashanthhh Kumarr

Reputation: 87

why so many complicated queries use this;

INSERT INTO DB.1.table_name_1 SELECT * FROM DB.2_table_name_2 WHERE table_name_1.COL_NAME > 0

INSERT OVERWRITE DB.1.table_name_1 SELECT * FROM DB.2_table_name_2 WHERE table_name_1.COL_NAME > 0

Upvotes: 0

Michael
Michael

Reputation: 371

You could use Hive's coalesce function (which takes the first non-null value of a series of column inputs) combined with an if statement, something like:

select * 
from table
where coalesce(if(col1 > 0, 1, null), if(col2 > 0, 1, null)...) = 1
;

The above will return any row where at least one column specified in the coalesce function returned a value of 1. Let me know if this works for you.

Edit: Another method which is a little cleaner (doesn't require you to list all columns) but less flexible:

select *
from table tb
where sort_array(array(tb.*))[n-1] > 0
;

The above will sort the array ascending so you can check if the largest value in the array is greater than zero and return only those rows.

Upvotes: 1

Related Questions