Reputation: 169
I have following tables and need to combine them in hive
Could any one please help me how can we achieve this. I tried date part with coalesce and it is fine. But fam part is not able to merge into single column.
Really appreciate your help.
Thanks, Babu
Upvotes: 0
Views: 134
Reputation: 1270873
You can use full outer join
. However, union
with left join
s often looks cleaner:
select df.date, df.name,
coalesce(t1.famcnt1, 0) as famcnt1,
coalesce(t2.famcnt2, 0) as famcnt2,
coalesce(t3.famcnt3, 0) as famcnt3
from ((select date, fam1 from table1
) union -- on purpose to remove duplicates
(select date, fam1 from table2
) union -- on purpose to remove duplicates
(select date, fam1 from table3
)
) df left join
table1 t1
on t1.date = df.date and t1.name = df.name left join
table2 t2
on t2.date = df.date and t2.name = df.name left join
table3 t3
on t3.date = df.date and t3.name = df.name;
If you are happy with NULL
instead of 0
, then no COALESCE()
s are needed at all.
Upvotes: 1
Reputation: 1782
Use full outer join in two steps, e.g.
with join1 as (
select coalesce(t1.date, t2.date) as date
, coalesce(fam1, fam2) as fam
, coalesce(famcnt1, 0) as famsct1
, coalesce(famcnt2, 0) as famsct2
from table1 as t1
full outer join table2 as t2
on (t1.date = t2.date and fam1 = fam2)
)
select coalesce(t1.date, t3.date) as date
, coalesce(fam, fam3) as fam
, coalesce(famcnt1, 0) as famsct1
, coalesce(famcnt2, 0) as famsct2
, coalesce(famcnt3, 0) as famsct3
from join1 as t1
full outer join table3 as t3
on (t1.date = t3.date and fam = fam3)
Upvotes: 1