Reputation: 419
Step 1: Grab all sorts of data from existing legacy systems and dump them into a series of staging tables in my database.
Step 2: Move the data from my staging tables into a more relational set of tables that I'm using specifically for my project.
In step 1 I'm just doing a bulk SELECT
and a bulk INSERT
; however, in step 2 I'm doing row-by-row inserts into my tables using OLEDB Command tasks so that I can log very specific row-level activity of everything that's happening. Here is my general layout for step 2 processes.
alt text http://dl.dropbox.com/u/2468578/screenshots/step_1.png
You'll notice 3 OLEDB tasks: 1 for the actual INSERT
, and 2 for success/fail INSERT
s into our logging table.
The main thing I'm logging is source table/id and destination table/id for each row that passes through this flow. I'm storing this stuff in variables and adding them to the data flow using a Derived Column so that I can easily map them to the query parameters of the stored procedures.
alt text http://dl.dropbox.com/u/2468578/screenshots/step_3.png
I've decided to store these logging values in variables instead of hard-coding the values in the SqlCommand field on the task, because I'm pretty sure you CAN'T put variable expressions in that field (i.e. exec storedproc @[User::VariableName],... ,... ,...
). So, this is the best solution I've found.
alt text http://dl.dropbox.com/u/2468578/screenshots/step_2.png
Is this the best solution? Probably not.
Is it good performance wise to add 4 logging columns to a data flow that consists of 500,000 records? Probably not.
Can you think of a better way?
Upvotes: 0
Views: 723
Reputation: 3690
to add to Cade's answer if you truly need the logging info on a row by row basis, your best best is to leverage the oledb destination and use one or both of the following transformations to add columns to the dataflow:
This should be your best bet and should't add much overhead
Upvotes: 0
Reputation: 89741
I really don't think calling an OLEDBCommand 500,000 times is going to be performant.
If you are already going to staging tables - load it all to a staging table and take it from there in T-SQL or even another dataflow (or to a raw file and then something else depending on your complete operation). A Bulk insert is going to be hugely more efficient.
Upvotes: 1