Reputation: 137997
I have target table and a staging table, for example:
create table #Target (Id int not null IDENTITY(1,1),Value int not null)
create table #Staging (TrackingId int not null, Value int not null)
insert #Staging (TrackingId, Value) values (201,1000),(202,1000),(203,2000),(204,2000)
Note that:
TrackingId
column, which is not included in the target table.I want to insert data from my staging table to my real table such that I keep the mapping from my #Staging.TrackingId
to the newly created #Target.Id
. My output, after the insertion, should look like this:
Id, TrackingId
1, 201
2, 202
3, 203
4, 204
(any other mapping is also possible, based on the order in which the insert occurs)
I tried two things. One is an order
-ed insert
, with an output
clause. This looks neat but I'm not sure SQL Server guarantees the order:
insert into #Target (Value)
OUTPUT inserted.Id --shame ", s.TrackingId" isn't allowed...
select s.Value from #Staging s
order by s.TrackingId
I also tried using MERGE
. This works, but seems wasteful (and a little dumb):
MERGE INTO #Target
USING #Staging AS s
ON 1=2 --dummy condition, we know we want an insert
WHEN NOT MATCHED BY TARGET THEN
INSERT (Value)
VALUES (s.Value)
OUTPUT INSERTED.Id, s.TrackingId
What is the correct approach for getting the newly created identity values and keeping the relation to my original order?
Upvotes: 0
Views: 316
Reputation: 48177
Yes. the insert will always work, once you include the order by, the insert will be executed in that order.
Here I change the staging order, btw you dont need OUTPUT
insert #Staging (TrackingId, Value) values (201,1000),(204,2000),(203,2000),(202,1000);
^ ^ ^ ^
INSERT INTO #Target (Value <, otherfields>)
SELECT TrackingID <, otherfields>
FROM #Staging
ORDER BY TrackingID
;
SELECT *
FROM #Target;
Please read the comments below in that article the answer from the author:
Yes, the identity values will be generated in the sequence established by the ORDER BY. If a clustered index exists on the identity column, then the values will be in the logical order of the index keys. This still doesn’t guarantee physical order of insertion. Index maintenance is a different step and that could also be done in parallel for example. So you could end up generating the identity values based on ORDER BY clause and then feeding those rows to the clustered index insert operator which will perform the maintenance task. You can see this in the query plan. You should really NOT think about physical operations or order but instead think of a table as a unordered set of rows. The index can be used to sort rows in logical manner (using ORDER BY clause) efficiently.
Upvotes: 1