Nathon

Reputation: 165

How to load and update changing data using SQOOP?

I tried an incremental import, but I have to specify either append or lastmodified mode. I need all changes: both modified and new records.

Upvotes: 0

Views: 335

Answers (1)

Dev

Reputation: 13753

Your understanding is partially correct here.

As per the docs,

An alternate table update strategy supported by Sqoop is called lastmodified mode. You should use this when rows of the source table may be updated, and each such update will set the value of a last-modified column to the current timestamp. Rows where the check column holds a timestamp more recent than the timestamp specified with --last-value are imported.

(Emphasis is mine)

Now let's try to understand this with an example.

I have an employees table with name, salary and updated_on fields. Sample record:

name | salary | updated_on
dev  | 2000   | 2016-01-01

In the following month, the salaries of some employees changed and some new employees joined.

In your Sqoop import command, specify --check-column updated_on, --incremental lastmodified and --last-value "2016-01-01".

All records added or updated after this --last-value will be imported, provided that every insert and update sets updated_on to the current timestamp.
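As a rough sketch, the full command for this example might look like the following. The JDBC URL, user name and target directory are made-up placeholders, not values from the question. The --merge-key name option is also an assumption: it treats name as a unique key so that updated rows replace their older copies in the target directory. Use your table's real primary key there, or --append if you would rather keep every version.

```shell
# Hypothetical connection details; replace with your own.
CONNECT_URL="jdbc:mysql://dbhost/company"   # assumed JDBC URL
DB_USER="etl_user"                          # assumed database user
TARGET_DIR="/data/employees"                # assumed HDFS target directory
LAST_VALUE="2016-01-01"                     # timestamp of the previous import

# Import every row whose updated_on is newer than LAST_VALUE.
run_import() {
  sqoop import \
    --connect "$CONNECT_URL" \
    --username "$DB_USER" \
    -P \
    --table employees \
    --target-dir "$TARGET_DIR" \
    --incremental lastmodified \
    --check-column updated_on \
    --last-value "$LAST_VALUE" \
    --merge-key name
}

# Run the import only where the sqoop client is installed.
if command -v sqoop >/dev/null 2>&1; then
  run_import
else
  echo "sqoop not found; skipping import"
fi
```

At the end of an incremental import, Sqoop prints the value to pass as --last-value on the next run, so LAST_VALUE should be updated after each successful import.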

Upvotes: 1
