Reputation: 316
What happens if Sqoop fails partway through a large data import job? Will it have persisted some of the data to HDFS before the failure occurred?
Upvotes: 0
Views: 4032
Reputation: 2337
I believe import and export work on similar transactional principles.
Since Sqoop breaks the export process down into multiple transactions, a failed export job may result in partial data being committed to the database.
This can further lead to subsequent jobs failing due to insert collisions in some cases, or lead to duplicated data in others.
Solution: You can overcome this problem by specifying a staging table via the --staging-table option. The staging table is an auxiliary table used to stage the exported data; the staged data is then moved to the destination table in a single transaction.
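As a rough sketch of what such an export could look like: the connection URL, user name, table names and HDFS path below are hypothetical placeholders, not values from the question. The script only prints the command (a dry run) so you can check it before running it against a real database. I am assuming the staging table already exists with the same structure as the target table, and `--clear-staging-table` is included so any rows left over from an earlier failed run are removed first.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own values.
CONNECT_URL="jdbc:mysql://db.example.com/sales"
TARGET_TABLE="orders"
STAGING_TABLE="orders_stage"      # assumed to exist already, same structure as TARGET_TABLE
EXPORT_DIR="/user/hadoop/orders"  # HDFS directory holding the data to export

# Export through the staging table; -P prompts for the password at run time.
EXPORT_CMD="sqoop export --connect $CONNECT_URL --username etl_user -P --table $TARGET_TABLE --export-dir $EXPORT_DIR --staging-table $STAGING_TABLE --clear-staging-table"

# Dry run: print the command instead of executing it.
echo "$EXPORT_CMD"
```

Once the printed command looks right, run it directly. As far as I know, the staging table cannot be combined with update mode (`--update-key`), so this only helps for insert-style exports.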
Upvotes: 1