Reputation: 619
I have a partitioned table in Hive, and I have set the per-node dynamic partitions limit to 2000:
set hive.exec.max.dynamic.partitions.pernode=2000
Now that the limit of 2000 partitions is reached, I am facing a problem, so I am trying to understand whether there is a way to remove this limitation.
Simply put, I don't want to set any value for hive.exec.max.dynamic.partitions.pernode at all; it should handle any number of partitions.
Could someone please help me with this?
Upvotes: 1
Views: 4180
Reputation: 339
I ran into this problem before. Add distribute by partition_column at the end of the SQL:
insert overwrite table table_A partition (date_id)
select xxxxxxxx
from table_B
distribute by date_id;
with "distribute by", same date_id value will be shuffled into one reducer. so a reducer may process several date_id instead of random date_id ( this may include all date_id ) .
Upvotes: 3
Reputation: 2469
As far as I know, that can't be done: Hive enforces a limit on the number of dynamic partitions it can create. To my knowledge, this limitation has to do with the fact that each partition is stored in a separate HDFS directory, so their number is bounded in advance to prevent performance issues.
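For reference, while the limit cannot be disabled, the relevant caps can be raised per session. A minimal sketch (the values here are illustrative, not recommendations):
set hive.exec.max.dynamic.partitions=10000;        -- job-wide cap on dynamic partitions (default 1000)
set hive.exec.max.dynamic.partitions.pernode=2000; -- cap per mapper/reducer task (default 100)
set hive.exec.max.created.files=150000;            -- cap on total files created by the job (default 100000)
Raising these only postpones the pressure; redistributing the data (e.g. with distribute by, as in the other answer) is the more scalable fix.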
Upvotes: 1