Mikel Laburu

Reputation: 157

Migrate Impala commands to Hive

I have to migrate some Impala shell commands to Hive. The commands are simple and I know what each of them does, but I don't know their Hive equivalents.

TABLE=$(impala-shell -i ${server} --delimited --quiet -q "select concat(db_normalized,'.',tb_normalized) from parametric_table where source='testSource' and product='testProduct' limit 1" 2>/dev/null)

nohup impala-shell -i ${server} -q "REFRESH $TABLE;" >> ${logsPath}/impalaRefresh.out &

The first command reads the database name and table name from a parametric table, filtered by some parameters. The second command then uses that result to run a REFRESH on the table.

Sorry if this is a pretty simple task but I am new to Impala and Hive.

Upvotes: 0

Views: 166

Answers (1)

leftjoin

Reputation: 38290

REFRESH is an Impala-specific command. Impala caches table metadata, so after a table has been loaded or altered by Hive you need to run REFRESH in Impala. Hive has no such command, so if you are migrating to Hive and will no longer use Impala, just remove those commands.

If you load partition folders by some means other than Hive, you may need to run MSCK REPAIR TABLE, or ALTER TABLE RECOVER PARTITIONS if you are on AWS EMR. You may also want to gather Hive table statistics using ANALYZE TABLE ... COMPUTE STATISTICS.
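As a rough sketch of how the two commands from the question might look on the Hive side: this assumes the `hive` CLI is on the PATH (with HiveServer2 you would likely use beeline instead) and that the table is partitioned, so MSCK REPAIR TABLE is actually useful. The function names and the log file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the `hive` CLI is available on PATH.

# Look up "db.table" in the parametric table (same query as the Impala version).
# -S runs in silent mode so only the query result is printed; -e runs the query.
lookup_table() {
  local source="$1" product="$2"
  hive -S -e "select concat(db_normalized,'.',tb_normalized) from parametric_table where source='${source}' and product='${product}' limit 1" 2>/dev/null
}

# Register partition folders that were added outside Hive.
# This is not a REFRESH equivalent; skip it if Hive itself loads the table.
repair_table() {
  local table="$1"
  [ -n "$table" ] || { echo "no table name given" >&2; return 1; }
  hive -S -e "MSCK REPAIR TABLE ${table};"
}

# Usage (hiveRepair.out is a hypothetical log name):
#   TABLE=$(lookup_table testSource testProduct)
#   repair_table "$TABLE" >> "${logsPath}/hiveRepair.out" 2>&1 &
```

If the table is not partitioned, or Hive writes the partitions itself, the second step can simply be dropped.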

Also, Hive 3 caches results and metadata, but you can only switch that caching on or off; you cannot refresh a particular table.

Upvotes: 0
