Thelonious Monk
Thelonious Monk

Reputation: 466

Dealing with duplicates based on certain conditions SQL hive

I have a table with two columns:

Key    Date
1      a
1      a
1      NULL
2      NULL
2      NULL
3      b
3      NULL
4      c
4      c

I need final output like this:

Key     Date
1       a
2       NULL
3       b
4       c

In other words, a key can have multiple equal dates,multiple equal dates and a null, all nulls, or one date one null. I need to remove duplicates from my table but am not able to create these rules. Any help would be greatly appreciated!

Upvotes: 0

Views: 61

Answers (2)

鄭有維
鄭有維

Reputation: 265

Try the sql Query:

select Distinct a.[Key],b.[Date] from TEST a left join TEST b on a.[Key] = b.[Key] and b.[Date] is not null 

enter image description here

Upvotes: 1

Ashish Singh
Ashish Singh

Reputation: 533

This will give the desired result

  select distinct key, date from Table_Name  , 
( select key as x ,size(collect_list(date)) as a from Table_Name group by key )
temp   
where (Table_Name.key = temp.x and a=0) OR  date is not null;

Upvotes: 1

Related Questions