Reputation: 125
With SSIS, I am loading a staged extract into its master database. Part of the process is using LOOKUP to get the ID of a string value, and if it doesn't exist, insert it into the reference table (with OLE DB COMMAND) and return that ID through a followed lookup.
The LOOKUP says no match for the first ~50 values even after inserted. Is there a way I can fix this? Or, is there another approach I should try?
Some research/attempts...
Originally, no matches were being returned. I had to change all the lookups to "No Cache" so it would reload the reference table (they are very small, so I doubt I'll have big performance problems).
An alternative I've seen is multicast into different lookups (based on columns I'm looking up), and merge join them back together after I've inserted/looked them successfully. But from what I've read, there are heavy performance issues with this approach.
I expect the lookup to match or not match.If not match, insert new value; then, lookup value again for ID (if no match, fail). Union both original matched and original not matched to do next lookup.
Upvotes: 2
Views: 2576
Reputation: 125
Here is the final SSIS data flow model... Per @KeithL suggestion, I added sorts before the merge joins.
using System;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
public class ScriptMain : UserComponent
{
IDTSConnectionManager100 serverConMgr;
public override void PreExecute()
{
base.PreExecute();
serverConMgr = this.Connections.Connection;
}
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
using (SqlConnection serverCon = new SqlConnection(serverConMgr.ConnectionString))
{
SqlCommand sql = new SqlCommand(
"INSERT INTO DIM.CONTRIBUTOR (CONTRIBUTOR) VALUES ('" + Row.CONTRIBUTOR + "') " +
"\n SELECT CAST(SCOPE_IDENTITY() AS INT) AS 'ID'"
, serverCon);
serverCon.Open();
SqlDataReader data = sql.ExecuteReader();
while (data.Read())
{
Row.CONTRIBUTORID = data.GetInt32(0);
}
data.Close();
}
}
}
The "script component" is the trickiest part. It inserts the new value and retrieves the corresponding ID. Below is the code for that script.
Note 3 important things...
Here are some other notes related to the script...
Upvotes: 1
Reputation: 5594
Try this pattern:
1.You are multicasting the dataset and grouping by just your lookup column
2.inserting your lookup and adding the scopeidentity(lookupid) to you grouped data
3.joining in the data you inserted to effective add your lookup value to every row
4.unioning your data back in
I am not positive, but you may need to add sorts to use merge join.
Upvotes: 1