Nipun

Reputation: 4319

Adding partitions to an external table in Hive takes a lot of time

I would like to know the best way(s) of adding partitions to an external table. I have an external table on S3 in Hive with the partition layout vehicle=/date=/hr=


A new vehicle can be added at any time of day, and some vehicles will have no data for a couple of hours in a day or for a couple of days.

A few possible solutions:
- MSCK REPAIR TABLE: it takes a lot of time.
- Adding partitions via a script: I may not know when a new vehicle gets created or which hours have no data for a vehicle.

How do people generally solve this problem of adding partitions to external tables?

Upvotes: 2

Views: 439

Answers (1)

leftjoin

Reputation: 38290

MSCK REPAIR TABLE is the right way to do this. If it runs too slowly, try switching off automatic stats gathering before repairing the table:

set hive.stats.autogather=false;

You can enable it again after recovering partitions.
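
Put together, the whole sequence might look like the sketch below. The table name vehicle_data is hypothetical, so substitute your own; re-enabling the setting at the end assumes it was on to begin with.

```sql
-- Skip automatic statistics gathering for this session
SET hive.stats.autogather=false;

-- Scan the table location and register any partition directories
-- (vehicle=/date=/hr=) that are missing from the metastore.
-- vehicle_data is a placeholder table name.
MSCK REPAIR TABLE vehicle_data;

-- Turn statistics gathering back on once the partitions are recovered
SET hive.stats.autogather=true;
```

The SET commands only affect the current session, so the repair has to run in the same session as the first SET for it to have any effect.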

Most probably you are hitting HIVE-18743 or a related bug. This helped in my case.

Upvotes: 1
