yioann
yioann

Reputation: 47

SQL Server locks - avoid insertion of duplicate entries

After reading a lot of articles and many answers related to the above subject, I am still wondering how the SQL Server database engine works in the following example:

Let's assume that we have a table named t3:

create table t3 (a int , b int);
create index test on t3 (a);

and a query as follow:

INSERT INTO T3
SELECT -86,-86
WHERE NOT EXISTS (SELECT 1 FROM t3 where t3.a=-86);

The query inserts a line in the table t3 after verifying that the row does not already exist based on the column "a".

Many articles and answers indicate that using the above query there is no way that a row will be inserted twice.

For the execution of the above query, I assume that the database engine works as follow:

  1. The subquery is executed first.
  2. The database engine sets a shared(s) lock on a range.
  3. The data is read.
  4. The shared lock is released. According to MSDN a shared lock is released as soon as the data has been read.
  5. If a row does not exist it inserts a new line in the table.
  6. The new line is locked with an exclusive lock (x)

Now consider the following scenario:

  1. The above query is executed by processor A (SPID 1).
  2. The same query is executed by a processor B (SPID 2).
  3. [SPID 1] The database engine sets a shared(s) lock
  4. [SPID 1] The subquery reads the data. Now rows are returned.
  5. [SPID 1] The shared lock is released.
  6. [SPID 2] The database engine sets a shared(s) lock
  7. [SPID 2] The subquery reads the data. No rows are return.
  8. [SPID 2] The shared lock is released.
  9. Both processes proceed with a row insertion (and we get a duplicate entry).

Am I missing something? Is the above way a correct way for avoiding duplicate entries?

A safe way to avoid duplicate entries is using the code below, but I am just wondering whether the above method is correct.

begin tran
    if (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
    begin
        INSERT INTO T3
        SELECT -86,-86
    end
commit

Upvotes: 3

Views: 6322

Answers (2)

Cade Roux
Cade Roux

Reputation: 89661

If you just have a unique constraint on the column, you'll never have duplicates.

The technique you've outlined will avoid you having to catch an error or an exception in the case of the (second "simultaneous") operation failing.

I'd like to add that relying on "outer" code (even T-SQL) to enforce your database consistency is not a great idea. In all cases, using declarative referential integrity at the table level is important for the database to ensure consistency and matching expectations, regardless of whether application code is written well or not. As in security, you need to utilize a strategy of defense in depth - constraints, unique indexes, triggers, stored procedures, and views can all assist in making a multi-layered approach to ensure the database presents a consistent and reliable interface to the application or system.

Upvotes: 8

Andomar
Andomar

Reputation: 238086

To keep locks between multiple statements, they have to be wrapped in a transaction. In your example:

If (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
    INSERT INTO T3 SELECT -86,-86

The update lock can be released before the insert is executed. This would work reliably:

begin transaction
If (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
    INSERT INTO T3 SELECT -86,-86
commit transaction

Single statements are always wrapped in a transaction, so this would work too:

 INSERT INTO T3 SELECT -86,-86
 WHERE NOT EXISTS (SELECT 1 FROM t3 with (updlock) where t3.a=-86)

(This is assuming you have "implicit transactions" turned off, like the default SQL Server setting.)

Upvotes: 4

Related Questions