Reputation:
I'm trying to write code for a batch import of lots of rows into the database.
Currently I bulk copy the raw data (from a .csv file) into a staging table, so that it's all at the database side. That leaves me with a staging table full of rows that identify 'contacts'. These now need to be moved into other tables of the database.
Next I copy over the rows from the staging table that I don't already have in the contacts table, and for the ones I do already have, I need to update the column named "GroupToBeAssignedTo", indicating a later operation I will perform.
I have a feeling I'm going about this wrong. The query isn't efficient and I'm looking for advice of how I could do this better.
update [t1]
set [t1].GroupToBeAssignedTo = [t2].GroupToBeAssignedTo from Contacts [t1]
inner join ContactImportStaging [t2] on [t1].UserID = [t2].UserID AND [t1].EmailAddress = [t2].EmailAddress AND [t2].GUID = @GUID
where not exists
(
select GroupID, ContactID from ContactGroupMapping
where GroupID = [t2].GroupToBeAssignedTo AND ContactID = [t1].ID
)
Might it be better to just import all the rows without checking for duplicates first and then 'clean' the data afterwards? Looking for suggestions of where I'm going wrong. Thanks.
EDIT: To clarify, the question is regarding MS SQL.
Upvotes: 2
Views: 223
Reputation: 23493
This answer is slightly "I wouldn't start from here", but it's the way I'd do it ;)
If you've got the Standard or Enterprise editions of MS SQL Server 2005, and you have access to SQL Server Integration Services, this kind of thing is a doddle to do with a Data Flow.
Probably more individual steps than a single insert-with-select statement, but it'll save your staging, and it's pretty intuitive to follow. Plus, you're probably already licenced to use it, and it's pretty easy :)
Upvotes: 1
Reputation: 425341
Next I copy over the rows from the staging table that I don't already have in the contacts table
It seems that implies that ContactGroupMapping
does not have records matching Contacts.id
, in which case you can just omit the EXISTS
:
UPDATE [t1]
SET [t1].GroupToBeAssignedTo = [t2].GroupToBeAssignedTo
FROM Contacts [t1]
INNER JOIN
ContactImportStaging [t2]
ON [t1].UserID = [t2].UserID
AND [t1].EmailAddress = [t2].EmailAddress
AND [t2].GUID = @GUID
Or I am missing something?
Upvotes: 0