Reputation: 725
I have a table which only has one column "tgt_dt" and 10 records: date "2015-09-01", date "2015-02-01", date "2015-08-10", etc.
I need to use these dates as condition to choose from anther table. i.e, in sql, i can just do:
select T2.tgt_dt, A,B, C,sum(D)
from TableA T1,
TableB T2
where T1.date between T2.tgt_dt - 90 and T2.tgt_dt
group by T2.tgt_dt, A,B, C
What's the best way to do this in hive?
Upvotes: 4
Views: 5854
Reputation: 356
You can just do a cartesian join here:
select T2.tgt_dt, A,B, C,sum(D)
from
TableA T1
JOIN
TableB T2
where T1.date between T2.tgt_dt - 90 and T2.tgt_dt
group by T2.tgt_dt, A,B, C
If you wanted a more efficient approach you could do something like:
select b.tgt_dt, A,B, C,sum(D)
from
TableA T1
LATERAL VIEW explode(array('date1','date2','date3')) b as tgt_dt
where T1.date between b.tgt_dt - 90 and b.tgt_dt
group by b.tgt_dt, A,B, C
Upvotes: 1