tgralex
tgralex

Reputation: 814

SQL Server: Is there any way to prevent accidental multiple updates of the same row in a single UPDATE Statement?

When you write an UPDATE statement from a join of two or more tables, there is always a possibility that you accidentally omitted one condition and it may end up updating the same row multiple times and lead to unexpected results, especially when there are complex keys/relationships.

Is there any way to ENSURE that if such situation happens, SQL Server raises an error or gives some kind of warning?

I'm usually careful on those things, but it happened to me few times recently, when I was trying to retrieve data from a not well known to me Database with complex relationships inside.

While my question is about SQL Server, how to prevent this situation, I'd be glad to hear how do you make sure its not happening?

Here is a small made up example of what I mean:

DECLARE @Customers TABLE (Id INT, Name VARCHAR(100), LatestInvoice VARCHAR(100))
DECLARE @Orders TABLE (Id INT, CustomerId INT, Invoice VARCHAR(100), Date DATETIME)

INSERT INTO @Customers (Id, Name)
VALUES (1, 'Customer1')

INSERT INTO @Orders (Id, CustomerId, Invoice, Date)
VALUES (1, 1, 'Invoice 1', '1/1/2019'),
       (2, 1, 'Invoice 2', '2/1/2019'),
       (3, 1, 'Invoice 3', '3/1/2019')

-- Correct UPDATE
-- one record updates once
UPDATE C
SET LatestInvoice = O.Invoice
FROM @Customers C 
JOIN @Orders O ON O.CustomerId = C.Id
WHERE O.Date = '3/1/2019'

-- Incorrect UPDATE
-- one record gets updated 3 times and result of Invoice could be anything
UPDATE C
SET LatestInvoice = O.Invoice
FROM @Customers C 
JOIN @Orders O ON O.CustomerId = C.Id

And BTW, how is such UPDATE mistake called?

Thanks a lot!

Upvotes: 0

Views: 2717

Answers (3)

tgralex
tgralex

Reputation: 814

Looks like if you have an UPDATE which might have multiple references to a target table rows, is the best to use 'MERGE' instead of 'UPDATE'

Unlike non-deterministic UPDATE the standard MERGE statement will generates an error if multiple source rows match one target row, requiring to revise the code to make it deterministic.

Here is the conversion of above UPDATE to the MERGE statement and the second one actually errors out!

-- works just fine
MERGE @Customers T
USING @Orders S
ON S.CustomerId = T.Id AND S.Date = '3/1/2019'
WHEN MATCHED
THEN 
    UPDATE SET LatestInvoice = S.Invoice;


-- omitting Date condition will follow with this error:
-- The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
MERGE @Customers T
USING @Orders S
ON S.CustomerId = T.Id --AND S.Date = '3/1/2019'
WHEN MATCHED
THEN 
    UPDATE SET LatestInvoice = S.Invoice;

Still, I think MS SQL should have have at least an option for the UPDATE statement to fail in non-deterministic updates as it certainly a mistake and leads to problems.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271121

You can use CROSS APPLY instead of JOIN:

UPDATE C
    SET LatestInvoice = O.Invoice
FROM @Customers C CROSS APPLY
     (SELECT TOP (1) O.*
      FROM @Orders O 
      WHERE O.CustomerId = C.Id
     ) O;

This will update once with an arbitrary matching row. You can add an ORDER BY to the subquery to provide more specification on the row that should be used.

EDIT:

I don't think there is a clean way to do this. I don't think there is a built-in function that will return an error from a query (such as throw() or raise_error() in T-SQL code). You can use the handy divide-by-zero error instead:

UPDATE C
    SET LatestInvoice = O.Invoice
FROM @Customers C JOIN
     (SELECT O.*, COUNT(*) OVER (PARTITION BY O.CustomerId) as cnt
      FROM @Orders O 
     ) O
     ON O.CustomerId = C.Id
WHERE (CASE WHEN cnt > 1 THEN 1 / 0 ELSE cnt END) = 1;

Upvotes: 1

Serg
Serg

Reputation: 22811

Not 100% defence, but .. Start designing an UPDATE with a SELECT

SELECT target.PrimaryKey, Count(*)
  -- update table expression here
GROUP BY target.PrimaryKey
HAVING Count(*) > 1

For example

SELECT C.id, Count(*)
-- update table expression here
FROM @Customers C 
JOIN @Orders O ON O.CustomerId = C.Id
-- 
GROUP BY C.id
HAVING Count(*) > 1

Upvotes: 1

Related Questions