Reputation: 3625
When I made a mistake in update query text, I spotted unpredictable query result. Here is query text for update.
DECLARE @T TABLE (Id int,[Name] nvarchar(100),RNA int)
INSERT INTO @T(Id,[Name])
SELECT [Id],[Name]
FROM (VALUES (1,N'D'),
(2,N'B'),
(3,N'S'),
(4,N'A'),
(5,N'F')
) AS vtable([Id],[Name])
UPDATE @T
SET RNA = T.RN
FROM (
select PP.Name,ROW_NUMBER() OVER(ORDER BY PP.Name) RN,PP.RNA from @T PP
) T
select * from @T
I know where mistake was made:
UPDATE @T
should be
UPDATE T
But why result (with "bad" query) looks like:
Id Name RNA
---- ----- -------
1 D 1
2 B 5
3 S 1
4 A 5
5 F 1
I suspect that 1 and 5 values are MIN(Id) and MAX(Id). Execution plan look like:
Will this situation be the same in every situation with this kind of mistake?
If yes, has this behaviour any practical value?
Upvotes: 4
Views: 156
Reputation: 69789
The situation will not be the same for every kind of mistake. You have a non-determinisitic update statement, that is to say theoritically any of the values for RN
in your subquery T
could be applied to any of the values in @T
. You are essentially running the UPDATE
version of this:
SELECT *
FROM @t a
CROSS JOIN
( SELECT TOP 1
PP.Name,
ROW_NUMBER() OVER(ORDER BY PP.Name) RN,
PP.RNA
FROM @T PP
ORDER BY NEWID()
) T
OPTION (FORCE ORDER);
The online manual states:
The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.
What is slightly interesting is that if you run the above you will get a different result each time (barring the 1/25 chance of getting the same result twice in a row), if you remove the random sorting using NEWID()
you will get the same value of RN
for each row, but the update consistently returns the same results, with 2 different RN
s. I am not surprised the result remains consistent with no random ordering because with no changes to the data, and no random factor introduced I would expect the optimiser to come up with the same execution plan no matter how many times it is run.
Since no explicit ordering is specified in your update query, the order is due to the order of the records on the leaf, if the order of the records is altered the result is altered. This can be shown by inserting the records of @T
into a new table with different IDs
DECLARE @T2 TABLE (Id int,[Name] nvarchar(100),RNA int);
INSERT @T2
SELECT id, Name, NULL
FROM @T
ORDER BY ROW_NUMBER() OVER(ORDER BY NEWID())
OPTION (FORCE ORDER);
UPDATE @T2
SET RNA = T.RN
FROM (
select PP.Name,ROW_NUMBER() OVER(ORDER BY PP.Name) RN,PP.RNA from @T2 PP
) T
SELECT *
FROM @T2;
I can see no reason why this is always the min or max value of RN
though, I expect you would have to delve deep into the optimiser to find this. Which is probably a new question better suited for the dba stack exchange.
Upvotes: 2