Reputation: 814
When you write an UPDATE
statement from a join of two or more tables, there is always a possibility that you accidentally omitted one condition and it may end up updating the same row multiple times and lead to unexpected results, especially when there are complex keys/relationships.
Is there any way to ENSURE that if such situation happens, SQL Server raises an error or gives some kind of warning?
I'm usually careful on those things, but it happened to me few times recently, when I was trying to retrieve data from a not well known to me Database with complex relationships inside.
While my question is about SQL Server, how to prevent this situation, I'd be glad to hear how do you make sure its not happening?
Here is a small made up example of what I mean:
DECLARE @Customers TABLE (Id INT, Name VARCHAR(100), LatestInvoice VARCHAR(100))
DECLARE @Orders TABLE (Id INT, CustomerId INT, Invoice VARCHAR(100), Date DATETIME)
INSERT INTO @Customers (Id, Name)
VALUES (1, 'Customer1')
INSERT INTO @Orders (Id, CustomerId, Invoice, Date)
VALUES (1, 1, 'Invoice 1', '1/1/2019'),
(2, 1, 'Invoice 2', '2/1/2019'),
(3, 1, 'Invoice 3', '3/1/2019')
-- Correct UPDATE
-- one record updates once
UPDATE C
SET LatestInvoice = O.Invoice
FROM @Customers C
JOIN @Orders O ON O.CustomerId = C.Id
WHERE O.Date = '3/1/2019'
-- Incorrect UPDATE
-- one record gets updated 3 times and result of Invoice could be anything
UPDATE C
SET LatestInvoice = O.Invoice
FROM @Customers C
JOIN @Orders O ON O.CustomerId = C.Id
And BTW, how is such UPDATE mistake called?
Thanks a lot!
Upvotes: 0
Views: 2717
Reputation: 814
Looks like if you have an UPDATE
which might have multiple references to a target table rows, is the best to use 'MERGE' instead of 'UPDATE'
Unlike non-deterministic UPDATE the standard MERGE statement will generates an error if multiple source rows match one target row, requiring to revise the code to make it deterministic.
Here is the conversion of above UPDATE
to the MERGE
statement and the second one actually errors out!
-- works just fine
MERGE @Customers T
USING @Orders S
ON S.CustomerId = T.Id AND S.Date = '3/1/2019'
WHEN MATCHED
THEN
UPDATE SET LatestInvoice = S.Invoice;
-- omitting Date condition will follow with this error:
-- The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
MERGE @Customers T
USING @Orders S
ON S.CustomerId = T.Id --AND S.Date = '3/1/2019'
WHEN MATCHED
THEN
UPDATE SET LatestInvoice = S.Invoice;
Still, I think MS SQL should have have at least an option for the UPDATE statement to fail in non-deterministic updates as it certainly a mistake and leads to problems.
Upvotes: 1
Reputation: 1271121
You can use CROSS APPLY
instead of JOIN
:
UPDATE C
SET LatestInvoice = O.Invoice
FROM @Customers C CROSS APPLY
(SELECT TOP (1) O.*
FROM @Orders O
WHERE O.CustomerId = C.Id
) O;
This will update once with an arbitrary matching row. You can add an ORDER BY
to the subquery to provide more specification on the row that should be used.
EDIT:
I don't think there is a clean way to do this. I don't think there is a built-in function that will return an error from a query (such as throw()
or raise_error()
in T-SQL code). You can use the handy divide-by-zero error instead:
UPDATE C
SET LatestInvoice = O.Invoice
FROM @Customers C JOIN
(SELECT O.*, COUNT(*) OVER (PARTITION BY O.CustomerId) as cnt
FROM @Orders O
) O
ON O.CustomerId = C.Id
WHERE (CASE WHEN cnt > 1 THEN 1 / 0 ELSE cnt END) = 1;
Upvotes: 1
Reputation: 22811
Not 100% defence, but .. Start designing an UPDATE with a SELECT
SELECT target.PrimaryKey, Count(*)
-- update table expression here
GROUP BY target.PrimaryKey
HAVING Count(*) > 1
For example
SELECT C.id, Count(*)
-- update table expression here
FROM @Customers C
JOIN @Orders O ON O.CustomerId = C.Id
--
GROUP BY C.id
HAVING Count(*) > 1
Upvotes: 1