Kundan

Reputation: 33

Sqoop copy data from MySQL table to a partitioned Hive table

I have written a Sqoop script:

HADOOP_USER_NAME=hdfs sqoop import --connect jdbc:mysql://cmsmaster.cy9mnipcdof2.us-east-1.rds.amazonaws.com/db --username user --password-file /user/password/dbpass.txt --fields-terminated-by ',' --target-dir /user/db/sqoop_internal --delete-target-dir --hive-import --hive-overwrite --hive-table sqoop_internal --query 'SOME_QUERY where $CONDITIONS' --split-by id

This imports the result of the query into a Hive table, overwriting the table's previous contents.

Now I need to modify this script so that it doesn't overwrite the whole Hive table. Instead, it should overwrite only one partition of that table. How can I do that?

Upvotes: 1

Views: 189

Answers (1)

M. Alexandru

Reputation: 624

From your question, I understand that you might need to do a Sqoop merge.

You need to remove:

--delete-target-dir and --hive-overwrite

And add:

--incremental lastmodified --check-column modified --last-value '2018-03-08 00:00:00' --merge-key yourPrimaryKey
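
Putting those changes together, the modified command might look like the sketch below. It is wrapped in a bash array and only printed (a dry run), which is my own framing and not part of the original script. SOME_QUERY, modified, and yourPrimaryKey are placeholders carried over from the thread, and I have not verified that every Sqoop version accepts --incremental lastmodified together with --hive-import, so treat it as a starting point.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build the modified import command and print it.
# SOME_QUERY, modified, and yourPrimaryKey are placeholders from the thread.
cmd=(
  sqoop import
  --connect jdbc:mysql://cmsmaster.cy9mnipcdof2.us-east-1.rds.amazonaws.com/db
  --username user
  --password-file /user/password/dbpass.txt
  --fields-terminated-by ','
  --target-dir /user/db/sqoop_internal
  --hive-import
  --hive-table sqoop_internal
  --query 'SOME_QUERY where $CONDITIONS'
  --split-by id
  # Added per the answer; --delete-target-dir and --hive-overwrite are dropped.
  --incremental lastmodified
  --check-column modified
  --last-value '2018-03-08 00:00:00'
  --merge-key yourPrimaryKey
)

# Print the command, shell-quoted, instead of executing it.
printf '%q ' "${cmd[@]}"
echo
```

To actually run it, replace the printf line with: HADOOP_USER_NAME=hdfs "${cmd[@]}"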

You can find more information in the official documentation: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_merge_literal

Upvotes: 1
