Reputation: 2618
I have three partitioned tables (Yearly Partitions) in Hive and all the tables have multiple partitions. I am joining all these three tables as part of my requirement. Now I want to run this sql only for the latest partition and not for the ones created before.
I tried to use max(partition) in where clause but it seems it is not supported
I did something like below (not the exact code. Just a code concept)
select
a.*,
b.*,
c.*
from table1 a
left join table2 b on a.ID = b.ID
left join table3 c on a.ID = c.ID
where
a.year = max(a.year) and
b.year = max(b.year) and
c.year = max(c.year)
I got this error
FAILED: SemanticException [Error 10128]: Line 108:23 Not yet supported place for UDAF 'max'
I can use multiple where clauses with a subquery containing "select max(year) from table" for all the table but that's not seem to be a feasible one. Any ideas on how to achieve this ?
UPDATE I tried the where clause with conditions below but it seems only one suq-query is supported in the where clause. Not sure how to resolve this. Appreciating any inputs on this
where
a.year in (select max(year) from table1) and
b.year in (select max(year) from table2) and
c.year in (select max(year) from table3
Upvotes: 2
Views: 3130
Reputation: 3554
Modified version:
select
<columns>
from
(
select
<columns>
from
table1 a
where a.year in (select max(year) from table1)
) a1
left join
(
select
<columns>
from
table2 b
where b.year in (select max(year) from table2)
) b1 on a1.ID = b1.ID
left join
(
select
<columns>
from
table3 c
where c.year in (select max(year) from table3)
) c1 on a1.ID = c1.ID
;
Upvotes: 2