ROTHUSTLE
ROTHUSTLE

Reputation: 145

SSIS 2008. Transferring data from one table to another ONLY if the data is not duplicated

I'm going to do my best to try to explain this. I currently have a data flow task that has an OLE DB Source transferring data from a table from a different database to a table to another database. It works fine but the issue I'm having is the fact that I keep adding duplicate data to the destination table.

So a CustomerID of '13029' with an amount of '$56.82' on Date '11/30/2012' is seen in that table multiple times. How do I make it so I can only have unique data transferring over to that destination table?

Upvotes: 1

Views: 3125

Answers (3)

Tom
Tom

Reputation: 116

In the dataflow task, where you transfer the data, you can insert a Lookup transformation. In the lookup, you can specify a data source (table or query, what serves you best). When you chose the data source, you can go to the Columns view and create a mapping, where you connect the CustomerID, Date and Amount of both tables.

In the general view, you can configure, what happens with matched/non matched row. Simply take the not matched output and direct it to the DB destination.

Upvotes: 3

billinkc
billinkc

Reputation: 61259

You will need to identify what makes that data unique in the table. If it's a customer table, then it's probably the customerid of 13029. However if it's a customer order table, then maybe it's the combination of CustomerId and OrderDate (and maybe not, I have placed two unique orders on the same date). You will know the answer to that based on your table's design.

Armed with that knowledge, you will want to write a query to pull back the keys from the target table SELECT CO.CustomerId, CO.OrderId FROM dbo.CustomerOrder CO If you know the process only transfers data from the current year, add a filter to the above query to restrict the number of rows returned. The reason for this is memory conservation-you want SSIS to run fast, don't bring back extraneous columns or rows it will never need.

Inside your dataflow, add a Lookup Transformation with that query. You don't specify 2005, 2008 or 2012 as your SSIS version and they have different behaviours associated with the Lookup Transformation. Generally speaking, what you are looking to do is identify the unmatched rows. By definition, unmatched means they don't exist in the target database so those are the rows that are new. 2005 assumes every row is going to match or it errors. You will need to click the Configure Error Output... button and select "Redirect Rows". 2008+ has an option under "Specify how to handle rows with no matching entries" and there you'll want "Redirect rows to no match output."

Now take the No match output branch (2008+) or the error output branch (2005) and plumb that into your destination.

What this approach doesn't cover is detecting and handling when the source system reports $56.82 and the target system has $22.38 (updates). If you need to handle that, then you need to look at some change detection system. Look at Andy Leonard's Stairway to Integration Services series of articles to learn about options for detecting and handling changes.

Upvotes: 2

Dominic Goulet
Dominic Goulet

Reputation: 8113

Have you considered using the T-SQL MERGE statement? http://technet.microsoft.com/en-us/library/bb510625.aspx

It will compare both tables on defined fields, and take an action if matched or not.

Upvotes: 0

Related Questions