Reputation: 609
I have a table with over 30 million rows, and it takes too long to do a truncate and reload. I can handle the incremental (insert) load using the SSIS not-matched output, but I am struggling with the delta loads where existing records have been updated. The table has no date fields, so there is no way of checking when the data was last modified.
Just wondering if anyone has come across a similar situation and how you resolved it. I don't want to load the full table, as it can take 3 hours on this one table alone!
Upvotes: 0
Views: 3512
Reputation: 381
You can use the Change Tracking feature supported by data stores such as Azure SQL Database and SQL Server to identify the delta data. You can find detailed implementation steps in the documentation for SYS_CHANGE_VERSION.
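As a rough sketch of that approach (database, table, and key names here are assumptions, and the last synced version would need to be persisted by your ETL job between runs):

```sql
-- Enable change tracking at the database and table level (one-time setup).
ALTER DATABASE MyDb SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.BigTable ENABLE CHANGE_TRACKING;

-- On each incremental run, pull only the rows changed since the last sync.
DECLARE @last_sync_version bigint = 0;  -- assumed to be stored from the previous run

SELECT t.*, ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION
FROM CHANGETABLE(CHANGES dbo.BigTable, @last_sync_version) AS ct
LEFT JOIN dbo.BigTable AS t
    ON t.Id = ct.Id;  -- Id assumed to be the primary key

-- Save this value so the next run picks up where this one left off.
SELECT CHANGE_TRACKING_CURRENT_VERSION();
```

`SYS_CHANGE_OPERATION` tells you whether each row was an insert, update, or delete, so the SSIS package can route each case separately.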
Upvotes: 0
Reputation: 31
On the matched output, do another lookup that checks whether the incoming row is different from the row with the matching primary key. If it is different, do an update; if it is the same, do nothing.
I know of two ways you can check to see if the rows are the same.
1) Add a derived column to the incoming row that creates a hash value of every column, and create the same hash column in your local table. Then compare the hashes in the lookup.
2) In the lookup, join on every field, i.e. A-->A, B-->B, C-->C, etc., then redirect the non-matches to do an update. This may not be possible if you have blob columns.
Number 1 is probably more efficient but harder to set up.
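One way to sketch option 1 on the database side is a persisted computed hash column; the table and column names below are assumptions, and this relies on SQL Server's built-in HASHBYTES (SHA2_256 is available from SQL Server 2012, CONCAT_WS from SQL Server 2017):

```sql
-- Hash every comparable column into one value; a NULL-safe separator
-- avoids 'AB'+'C' colliding with 'A'+'BC' across column boundaries.
ALTER TABLE dbo.BigTable
    ADD RowHash AS HASHBYTES('SHA2_256',
        CONCAT_WS('|', ColA, ColB, ColC)) PERSISTED;
```

The SSIS side would then compute the same hash over the incoming columns (e.g. in a Script Component) and the lookup compares only the key and `RowHash`, sending mismatches to the update path.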
Upvotes: 0