TrevDev
TrevDev

Reputation: 1106

TSQL Large Insert of Relational Data, W/ Foreign Key Upsert

Relatively simple problem.
Table A has ID int PK, unique Name varchar(500), and cola, colb, etc
Table B has a foreign key to Table A.

So, in the application, we are generating records for both table A and table B into DataTables in memory.

We would be generating thousands of these records on a very large number of "clients".

Eventually we make the call to store these records. However, records from table A may already exist in the database, so we need to get the primary keys for the records that already exist, and insert the missing ones. Then insert all records for table B with the correct foreign key.

Proposed solution:

I was considering sending an xml document to SQL Server to open as a rowset into TableVarA, update TableVarA with the primary keys for the records that already exist, then insert the missing records and output that to TableVarNew, I then select the Name and primary key from TableVarA union all TableVarNew.

Then in code populate the correct FKs into TableB in memory, and insert all of these records using SqlBulkCopy.

Does this sound like a good solution? And if so, what is the best way to populate the FKs in memory for TableB to match the primary key from the returned DataSet.

Upvotes: 3

Views: 729

Answers (2)

paparazzo
paparazzo

Reputation: 45096

This is more of a comment than a complete answer but I was running out of room so please don't vote it down for not being up to answer criteria.

My concern would be that evaluating a set for missing keys and then inserting in bulk you take a risk that the key got added elsewhere in the mean time. You stated this could be from a large number of clients so it this is going to happen. Yes you could wrap it in a big transaction but big transactions are hogs would lock out other clients.

My thought is to deal with those that have keys in bulk separate assuming there is no risk the PK would be deleted. A TVP is efficient but you need explicit knowledge of which got processed. I think you need to first search on Name to get a list of PK that exists then process that via TVP.

For data integrity process the rest one at a time via a stored procedure that creates the PK as necessary.

Thousands of records is not scary (millions is). Large number of "clients" that is the scary part.

Upvotes: 0

marc_s
marc_s

Reputation: 754598

Sounds like a plan - but I think the handling of Table A can be simpler (a single in-memory table/table variable should be sufficient):

  • have a TableVarA that contains all rows for Table A
  • update the ID for all existing rows with their ID (should be doable in a single SQL statement)
  • insert all non-existing rows (that still have an empty ID) into Table A and make a note of their ID

This could all happen in a single table variable - I don't see why you need to copy stuff around....

Once you've handled your Table A, as you say, update Table B's foreign keys and bulk insert those rows in one go.

What I'm not quite clear on is how Table B references Table A - you just said it had an FK, but you didn't specify what column it was on (assuming on ID). Then how are your rows from Table B referencing Table A for new rows, that aren't inserted yet and thus don't have an ID in Table A yet?

Upvotes: 3

Related Questions