Reputation: 339
I have a User table with a ReferenceId column, which points to a Reference table. Each user needs a ReferenceId which has to first be generated by an INSERT into a Reference table. The ReferenceId is a unique number that is different for each and every user.
User.ReferenceId = Reference.Id
User Table
+------------+----------+-------------+--+
| First Name | LastName | ReferenceId | |
+------------+----------+-------------+--+
| Bob | Smith | 123 | |
| John | Smith | 456 | |
+------------+----------+-------------+--+
Reference Table
+-----+----------+--+
| Id | Type | |
+-----+----------+--+
| 123 | External | |
| 456 | Internal | |
+-----+----------+--+
I will need to perform an operation to create two million users. I understand that if I create one user at a time sequentially, I can use the Output keyword to capture each generated ReferenceId, and then use it in my INSERT statement to the User table.
But is it possible to use batch/bulk operations here to improve performance so that I can create all the ReferenceIds in a single query, and then map all of them to a BULK INSERT to the User table, so that all new users can be created with their corresponding ReferenceIds in a single Query?
Upvotes: 0
Views: 158
Reputation: 164829
You could use an insert trigger to insert a reference for each user and also update the users table.
create trigger on_insert_add_reference on "user"
after insert
as
declare @ref_ids table(id int, user_id int)
set nocount on
begin
-- For each inserted user, insert into reference.
-- Store the reference id and user id pair in @ref_ids.
merge reference
using inserted as inserted_users
on (1 = 0)
when not matched then
insert ("type") values('external')
output inserted.id, inserted_users.id into @ref_ids;
-- Update the just inserted users with their reference IDs.
update "user"
set reference_id = [@ref_ids].id
from "user"
join inserted on inserted.id = "user".id
join @ref_ids on [@ref_ids].user_id = "user".id;
end
You'll also need a delete trigger to delete references when the user is deleted.
One problem with this approach is users.reference_id
cannot be made not null
. It's possible it could be rewritten as an instead of insert
trigger.
Since this is a 1-to-1 relationship, it is much simpler to put the reference ID and type into users. Since it is "a unique number that is different for each and every user" it could simply be their primary key.
Upvotes: 0
Reputation: 1574
GMB and others covered the design aspect of this; this is definitely not an ideal way of separating your data. However, to answer your question:
If the only way to capture which ReferenceID goes to which User is during the INSERT into ReferenceTable, then we either need to step in and capture this before moving on to the next ReferenceTable entry (which means doing these 1 at a time), or we manually configure this so no capture is necessary.
Best way to do this in bulk is to use a staging table. This staging table will enable you to keep a key between the two tables en masse. Bulk insert all your 2M users into a staging table; add a new column named referenceID; populate this with whatever your next set of records would be, and then use this staging table to create your final records. Force identity insert if necessary.
-- bulk load everything into stagingTable using your preferred method
-- Create sequence to hold your new referenceID
-- Start higher than your current max referenceID to avoid overlap
create sequence myNewIDSEQ
start with 456
increment by 1;
go
-- Create column to hold new referenceID
alter table stagingTable add referenceID int;
go
-- Update referenceID with a unique value for each entry
update stagingTable set referenceID = NEXT VALUE FOR myNewIDSEQ;
-- Merge your staging table against your Reference table to create all these new users
MERGE [referenceTable] as dest
USING stagingTable as src
ON dest.referenceID = src.referenceID
WHEN NOT MATCHED THEN
INSERT (referenceID, [type])
VALUES (referenceID, [type]);
-- Now Merge your staging table against your User table to create those users
MERGE [userTable] as dest
USING stagingTable as src
ON dest.referenceID = src.referenceID
WHEN NOT MATCHED THEN
INSERT (referenceID, firstName, lastName)
VALUES (referenceID, firstName, lastName);
Upvotes: 1