Christopher Miller
Christopher Miller

Reputation: 11

SQL Server - Inconsistencies with how transactions are handled when Read Committed Snapshot is enabled

Overview:

SQL seems to read "dirty" information from an incomplete transaction in another session when performing an update even though we are attempting to read only committed data. Some very specific criteria must be true for this issue to occur (it is difficult to reproduce - but we can do so here).

How to reproduce the behavior

SQL2017 or SQL2019 or Azure SQL

  1. Identify a database for testing and turn on Read Committed Snapshot. (Set READ_COMMITTED_SNAPSHOT ON WITH NO_WAIT)

  2. Run the ScenarioPrep.SQL script (below) in the database to create the test objects.

    • This creates 2 tables -- a "parent" table and a "child" table.
    • Notice that the parent table has two entries, "Earth" and "Mars"
    • Notice that the child table has 1 entry with the id of the parent row with the name "Earth". Mars has no child.
  3. Prepare to run the two remaining scripts (below) simultaneously by first opening the two additional SSMS Sessions, with One.sql in one window and Two.sql in another window.

    • Notice that One.sql sets initial conditions, then has two update statements wrapped inside a transaction.
    • Notice that Two.sql attempts to update the records also being updated in the first script.
  4. Execute the One.sql script. This script has a delay in the middle of the transaction to help reproduce the test conditions.

  5. While One.sql is running (specifically during the delay), execute Two.sql in the second window.

  6. Notice the unexpected results in the second window. The Earth Record was updated unexpectedly. The update should not have succeeded because there was at no time a committed transaction that had the parent "Earth" record in a state of 2 and a child record in a state of 0. Before the transaction, Earth and its child record were both in a status of 0. After the transaction, Both Earth and its child were in a status of 2.

However, Two.sql does succeed in updating the "Earth" record because it somehow reads the parent as being in a status of 2 and the child in a status of 0 during the delay -- BUT THIS IS NOT A COMMITTED TRANSACTION AT THIS TIME. This is a "dirty" state that should not be seen when the update occurs. The update should see the state as it was either before or after the committed transaction in the other session.

The expected result would be that the Two.sql script would not update any records because at no point is there a COMMITTED transaction where earth is in a state of 2 and its child is in a state of 0. Indeed, under most conditions, this proves true.

Observations

The problem only occurs if ALL of the following conditions are true:

  1. Database is in Read Committed Snapshot mode.
  2. The Mars record is present (even though it's not affected by either transaction)
  3. The Mars record is in a status of 2 (partially matching the where clause in Two.sql)
  4. The Mars record is earlier in the table (has a lower primary key value) than the Earth record.
  5. The action performed is an UPDATE (replacing the update with the select gives expected results).
  6. The update occurs during a join across two or more tables.
  7. The transaction mode for the Two.SQL is Read Committed (Ironically, Read Uncommitted correctly blocks until the other transaction is done and works as one would expect).
  8. There must, of course, be a sufficient delay in the first transaction for the test scenario to occur.

As implied above, any of the following changes will cause the problem to go away: Changing the order of the Mars and Earth Record in the parent table. Deleting the mars record from the parent table (proving that an unrelated record affects how the other is updated.) Putting the Mars record in a status other than 2. Changing the Update to some other operation, such as select. Turning of Read Committed Snapshot mode.

Of course, removing the transaction from around the two update statements in One.sql would cause the issue to occur all the time, but that would be expected in that case. The whole purpose of the two update statements being wrapped inside a transaction is to avoid this scenario.

Repo Code

Below are the three code files necessary to reproduce the behavior:

ScenarioPrep.sql:

drop table if exists dbo.parent
drop table if exists dbo.child

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[Parent](
    Id int NOT NULL,
    [Name] [varchar](100) NULL,
    [StatusID] [int] NULL
 CONSTRAINT [PK_Parent] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO



SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[Child](
    
    ID INT NOT NULL,
    [ParentID] INT NULL,
    [StatusID] [int] NULL,
 CONSTRAINT [PK_Child] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

INSERT INTO dbo.parent(ID,name,statusid)
values
    (200,'Earth',0),--"Earth" must have an id (guid or int) greater than "Mars" for behavior to occur. Use comparison to know which guid is greater. String compare may give different results.
    (100,'Mars',0)
    
    
    

INSERT INTO dbo.child(ID,ParentID,StatusID)
VALUES
    (201,(SELECT ID from dbo.parent where name = 'Earth'),0)
    
    
GO

One.sql:

--Reset test scenario - put only mars in a status of 2. 

update dbo.parent set statusid = 2 where name = 'Mars'--make sure mars partially matches where clause
update dbo.parent set statusid = 0 where name = 'Earth'--make sure earth starts in 0 status
update dbo.child set statusid = 0 --make sure earth's child starts in 0

--Here's where the actual test begins.
--Note: At this point, before the transaction below, the commited transactions have earth and its child both in a status of 0

        begin transaction --wrap in a transaction. We want all or nothing here.
         
             update dbo.parent set statusid = 2 where name = 'Earth'

             waitfor delay '00:00:15';
             --Note: At this point, only uncommited/dirty data shows earth in 2 and its child in 0. There are no commited transactions reflecting this state.
             --it is during this moment we rung BugTwo.sql
             update c set statusid = 2 from dbo.child c
             inner join dbo.parent p
             on c.parentid = p.id


        commit transaction
--Note: after the transaction, earch and it's child are both in a status of 2.
--If we are reading only commited data, we should not see a scenario where earth is 2 and its child is 0.

Two.Sql:

 --Note: make sure read committed snapshot is on before running these tests.

--read only commited data...

--run this while one.sql is running!!

set transaction isolation level read committed
        
    update p set statusid = 1300--a status that will only occurr if the conditions below are met.
    --select * --replacing the update with a select causes the strange behavior to go away.
    from dbo.parent p
    inner join child c
    on p.id = c.parentid
    where p.statusid=2 --where earth is 2
    and c.statusid=0 --and its child is 0
    --note that there are no committed transactions that reflect this scenario. We would expecte this not to update any records.
    
    
        
--yet earth gets updated...
SELECT * FROM dbo.parent where statusid = 1300
--we are attempting to read committed data only, so why did we see and uncommited scenario?

Upvotes: 1

Views: 476

Answers (2)

user14570231
user14570231

Reputation: 129

This is not a bug, but it is perhaps counter-intuitive. Nevertheless, it is required for correctness.

The behaviour of statements that modify data under one of the row-versioning isolation levels is (lightly) documented in the Transaction Locking and Row Versioning Guide:

In a read-committed transaction using row versioning, the selection of rows to update is done using a blocking scan where an update (U) lock is taken on the data row as data values are read. This is the same as a read-committed transaction that does not use row versioning. If the data row does not meet the update criteria, the update lock is released on that row and the next row is locked and scanned.

In your example, the update in session 2 reads the parent table using update semantics to ensure we are modifying the record using the very latest committed state. This is necessary to avoid lost updates. The update therefore sees the parent record as it is after the transaction in session 1 commits.

This only applies to the parent table, because it is the target of the update. The child table in session 2 is read using read committed snapshot semantics (committed version as at the start time of the statement) because it is not the target of the update.

You can find more detail on this in Data Modifications under Read Committed Snapshot Isolation by Paul White.

Upvotes: 0

SQLpro
SQLpro

Reputation: 5161

You are confusing what does READ COMMITTED TRANSACTION ISOLATION state do...

This isolation level guaranty you that you will not have some dirty reads, but the locks induced by this isolation level apply only for each statement, not for the all transaction.

By using the REPEATABLE READ isolation level, the rows locked will be maintenaid until the finalization of the transaction.

A test that I use systematically, in my engineering courses, to demonstrate these different levels of isolation, is the following: https://sqlpro.developpez.com/isolation-transaction/

Step 0 - in a SSMS Window - creating the database :

USE master
GO

IF EXISTS (SELECT *
           FROM   master.sys.databases
           WHERE  name = 'DB_ISO_LEVEL')
   DROP DATABASE DB_ISO_LEVEL
GO

CREATE DATABASE DB_ISO_LEVEL
GO

USE DB_ISO_LEVEL
GO

CREATE TABLE T_ISO ( COL INT)
GO

INSERT INTO T_ISO VALUES (1)
INSERT INTO T_ISO VALUES (2)
INSERT INTO T_ISO VALUES (3)
GO

USE master
GO

past and run it immedtaly...

Step 1 - In another SSMS window - a first transaction :

USE DB_ISO_LEVEL
GO

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED 
BEGIN TRANSACTION TRAN1

DECLARE @TOTAL INT

SELECT @TOTAL = SUM(COL)
FROM   T_ISO

WAITFOR DELAY '00:00:20'

 
Sélectionnez

SELECT @TOTAL = @TOTAL - SUM(COL)
FROM   T_ISO

SELECT @TOTAL AS TOTAL

COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL READ COMMITTED

USE master
GO

Do not runs yet... Step 2 - a second transaction - In a third SSMS Windows

USE DB_ISO_LEVEL
GO

BEGIN TRANSACTION TRAN2

UPDATE T_ISO
SET    COL = COL + 1

WAITFOR DELAY '00:00:20'

ROLLBACK TRANSACTION

USE master
GO

Now run step 1 and 5 second later step 2 this will produce a dirty read (the expected value is 0...)

Run the database drop and create (Step 0 in Window 1).

Modify the ISOLATION LEVEL of step 1 by READ COMMITTED and run step 1, and 5 second later step 2. The dirty read Disappears and the value is really 0...

Upvotes: 0

Related Questions