Jeremy
Jeremy

Reputation: 46322

Record locking and concurrency issues

My logical schema is as follows: enter image description here
A header record can have multiple child records.

Multiple PCs can be inserting Child records, via a stored procedure that accepts details about the child record, and a value.

Multiple PCs can be querying unprocessed header records, via a stored procedure

So basically my header query looks like this:

BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT TOP 1
    *
INTO
    #unprocessed
FROM
    Header h WITH (READPAST, UPDLOCK)
JOIN
    Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
JOIN
    Child part2 ON part1.HeaderID = part2.HeaderID AND 
WHERE
    h.Processed = 0x0;

UPDATE
    Header
SET
    Processed = 0x1
WHERE
    HeaderID IN (SELECT [HeaderID] FROM #unprocessed);  

SELECT * FROM #unprocessed

COMMIT TRAN

So the above query ensures that concurrent queries never return the same record.

I think my problem is on the insert query. Here's what I have:

DECLARE @HeaderID INT

BEGIN TRAN

--Create header record if it doesn't exist, otherwise get it's HeaderID
MERGE INTO
    Header WITH (HOLDLOCK) as target
USING
(
    SELECT 
        [Value] = @Value,  --stored procedure parameter
        [HeaderID]

) as source ([Value], [HeaderID]) ON target.[Value] = source.[Value] AND
                                     target.[Processed] = 0    
WHEN MATCHED THEN 
    UPDATE SET
        --Get the ID of the existing header
        @HeaderID = target.[HeaderID],
        [LastInsert] = sysdatetimeoffset() 
WHEN NOT MATCHED THEN
    INSERT
    (
        [Value]
    )
    VALUES
    (
        source.[Value]
    )


--Get new or existing ID
SELECT @HeaderID = COALESCE(@HeaderID , SCOPE_IDENTITY());

--Insert child with the new or existing HeaderID
INSERT INTO 
    [Correlation].[CorrelationSetPart]
    (
        [HeaderID],
        [Name]  
    )
VALUES
(
    @HeaderID,
    @Name --stored procedure parameter
);

My problem is that insertion query is often blocked by the above selection query, and I'm receiving timeouts. The selection query is called by a broker, so it can be called fairly quickly. Is there a better way to do this? Note, I have control over the database schema.

Upvotes: 3

Views: 162

Answers (1)

Vladimir Baranov
Vladimir Baranov

Reputation: 32675

To answer the second part of the question

You only ever want one machine PC to query and process each header record. There should never be an instance where a header record and it's children should be processed by more than one PC

Have a look at sp_getapplock.

I use app locks within the similar scenario. I have a table of objects that must be processed, similar to your table of headers. The client application runs several threads simultaneously. Each thread executes a stored procedure that returns the next object for processing from the table of objects. So, the main task of the stored procedure is not to do the processing itself, but to return the first object in the queue that needs processing. The code may look something like this:

CREATE PROCEDURE [dbo].[GetNextHeaderToProcess]
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY

        DECLARE @VarHeaderID int = NULL;

        DECLARE @VarLockResult int;
        EXEC @VarLockResult = sp_getapplock
            @Resource = 'GetNextHeaderToProcess_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Find the most suitable header for processing
            SELECT TOP 1
                @VarHeaderID = h.HeaderID
            FROM
                Header h
                JOIN Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
                JOIN Child part2 ON part1.HeaderID = part2.HeaderID
            WHERE
                h.Processed = 0x0
            ORDER BY ....;
            -- sorting is optional, but often useful
            -- for example, order by some timestamp to process oldest/newest headers first

            -- Mark the found Header to prevent multiple processing.
            UPDATE Header
            SET Processed = 2 -- in progress. Another procedure that performs the actual processing should set it to 1 when processing is complete.
            WHERE HeaderID = @VarHeaderID;
            -- There is no need to explicitly verify if we found anything. 
            -- If @VarHeaderID is null, no rows will be updated
        END;

        -- Return found Header, or no rows if nothing was found, or failed to acquire the lock
        SELECT
            @VarHeaderID AS HeaderID
        WHERE
            @VarHeaderID IS NOT NULL
        ;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;

END

This procedure should be called from the procedure that does actual processing. In my case the client application does the actual processing, in your case it may be another stored procedure. The idea is that we acquire the app lock for the short time here. Of course, if the actual processing is fast you can put it inside the lock, so only one header can be processed at a time.

Once the lock is acquired we look for the most suitable header to process and then set its Processed flag. Depending on the nature of your processing you can set the flag to 1 (processed) right away, or set it to some intermediary value, like 2 (in progress) and then set it to 1 (processed) later. In any case, once the flag is not zero the header will not be chosen for processing again.

These app locks are separate from normal locks that DB puts when reading and updating rows and they should not interfere with inserts. In any case, it should be better than locking the whole table as you do WITH (UPDLOCK).

Returning to the first part of the question

You only ever want one header record inserted for any given "value". So if two child records are inserted with the same "Value" supplied, the header should only be created once.

You can use the same approach: acquire app lock in the beginning of the inserting procedure (with some different name than the app lock used in querying procedure). Thus you would guarantee that inserts happen sequentially, not simultaneously. BTW, in practice most likely inserts can't happen simultaneously anyway. The DB would perform them sequentially internally. They will wait for each other, because each insert locks a table for update. Also, each insert is written to transaction log and all writes to transaction log are also sequential. So, just add sp_getapplock to the beginning of your inserting procedure and remove that WITH (HOLDLOCK) hint in the MERGE.

The caller of the GetNextHeaderToProcess procedure should handle correctly the situation when procedure returns no rows. This can happen if the lock acquisition timed out, or there are simply no more headers to process. Usually the processing part simply retries after a while.

Inserting procedure should check if the lock acquisition failed and retry the insert or report the problem to the caller somehow. I usually return the generated identity ID of the inserted row (the ChildID in your case) to the caller. If procedure returns 0 it means that insert failed. The caller may decide what to do.

Upvotes: 1

Related Questions