asmgx
asmgx

Reputation: 8004

How to delete inserted records without having a key?

I have 2 tables: Tags and TagsTemp.

Tags has millions of rows while TagsTemp has thousands.

They both are physical tables.

The Primary Key in both tables is ID and it is Auto ID (and so IDs are in different numbers, they do not match).

I have to move all data from TagsTemp to Tags.

That is simple. I did this:

BEGIN TRANS
INSERT INTO Tags (C1, C2, C3)
SELECT C1, C2, C3 FROM TagsTemp

DELETE FROM TagsTemp

COMMIT

The problem is that TagsTemp could have new records during this transaction (during the time between the insert and the delete statements).

I do not want to delete table with records that were not inserted.

I was thinking of something like:

BEGIN TRANS
INSERT INTO @T (ID, C1, C2, C3)
SELECT ID, C1, C2, C3 FROM TagsTemp

INSERT INTO Tags (C1, C2, C3)
SELECT C1, C2, C3 FROM @T

DELETE FROM TagsTemp
WHERE ID IN (SELECT ID FROM @T)

COMMIT

But this is seems not the best way and bad performance.

Is there any better solution to do that?

Note: I cannot change Tables structure.

Upvotes: 3

Views: 827

Answers (4)

Gordon Linoff
Gordon Linoff

Reputation: 1269703

The other approaches are really just fine. However, a different method would be to use partitioning.

This assumes that you have an createdAt date in the temp table. Then, do the following:

  • create a new partition on the table with the current date/time
  • load the data from the temporary table using the current date/time
  • drop the old partition

The major advantage to this approach is that dropping a partition should be much, much faster than deleting rows.

Upvotes: 4

Dale K
Dale K

Reputation: 27208

Inspired by Bart Hofland you can in fact capture the IDs of the records you are transferring during the insert, so long as you use a MERGE statement as your insert as follows:

DECLARE @IDs TABLE (ID INT PRIMARY KEY);

BEGIN TRAN;

MERGE Tags target
USING TagsTemp source
ON 0 = 1
WHEN NOT MATCHED THEN 
INSERT (C1, C2, C3) VALUES (source.C1, source.C2, source.C3)
OUTPUT source.id INTO @IDs;

DELETE
FROM TagsTemp
WHERE ID IN (SELECT ID FROM @IDs);

COMMIT;

This allows a more flexible approach for when you are not basing your criteria on an auto-incrementing identity column.

Upvotes: 5

Bart Hofland
Bart Hofland

Reputation: 3905

The answer that Dale K provided is excellent.

However, if you (ever) need to move records over based on other criteria than its auto-incrementing primary key, you could use a temporary table or a table variable that stores the keys to delete:

DECLARE @IDs TABLE (ID INT PRIMARY KEY);

INSERT INTO @IDs (ID)
SELECT ID
FROM TagsTemp
---WHERE [whatever you want]

BEGIN TRAN;

INSERT INTO Tags (C1, C2, C3)
SELECT C1, C2, C3
FROM TagsTemp
WHERE ID IN (SELECT ID FROM @IDs);

DELETE
FROM TagsTemp
WHERE ID IN (SELECT ID FROM @IDs);

COMMIT;

With this logic, you can use any criteria in a WHERE-clause of the first SELECT-query.

Upvotes: 2

Dale K
Dale K

Reputation: 27208

For a few thousand rows I think your solution is perfectly good, but just to let you know another approach, you can obtain the MAX(ID) from the table and then use that, assuming your auto IDs are incrementing ints:

BEGIN TRAN;

DECLARE @MaxId INT;

SELECT @MaxId = MAX(ID) from TagsTemp;

INSERT INTO Tags (C1, C2, C3)
SELECT C1, C2, C3
FROM TagsTemp
WHERE ID <= @MaxId;

DELETE
FROM TagsTemp
WHERE ID <= @MaxId;

COMMIT;

PS: Not sure about your END TRANS - I assume you mean COMMIT?

Upvotes: 9

Related Questions