user2531569
user2531569

Reputation: 619

how to remove the limit of partitions per node in Hive?

I have a partitioned table in Hive. And for each node I have set the partitions limit to 2000.

set hive.exec.max.dynamic.partitions.pernode=2000

Now that after 2000 is reached, I am facing a problem. So i am trying to understand if there a possibility to remove this limitation?

Simply I don't want to set any value for hive.exec.max.dynamic.partitions.pernode It should handle any number of partitions.

So could someone please help me on this?

Upvotes: 1

Views: 4180

Answers (2)

Mike Gan
Mike Gan

Reputation: 339

I met this problem before. Add distribute by partition_column at the end of the SQL .

insert overwrite table table_A partition (date_id)
select xxxxxxxx
from table_B
distribute by date_id;

with "distribute by", same date_id value will be shuffled into one reducer. so a reducer may process several date_id instead of random date_id ( this may include all date_id ) .

Upvotes: 3

Jaime Caffarel
Jaime Caffarel

Reputation: 2469

As far as I know, that can't be done, Hive enforces a limit in the number of dynamic partitions that it can create. Up to my knowledge, this limitation has to do with the fact that each partition is stored in a separate HDFS directory, so the number of them is somehow bounded in advance to prevent performance issues.

Upvotes: 1

Related Questions