user255048
user255048

Reputation: 373

SQL - Inserting and Updating Multiple Records at Once

I have a stored procedure that is responsible for inserting or updating multiple records at once. I want to perform this in my stored procedure for the sake of performance.

This stored procedure takes in a comma-delimited list of permit IDs and a status. The permit IDs are stored in a variable called @PermitIDs. The status is stored in a variable called @Status. I have a user-defined function that converts this comma-delimited list of permit IDs into a Table. I need to go through each of these IDs and do either an insert or update into a table called PermitStatus.

If a record with the permit ID does not exist, I want to add a record. If it does exist, I'm want to update the record with the given @Status value. I know how to do this for a single ID, but I do not know how to do it for multiple IDs. For single IDs, I do the following:

-- Determine whether to add or edit the PermitStatus
DECLARE @count int
SET @count = (SELECT Count(ID) FROM PermitStatus WHERE [PermitID]=@PermitID)

-- If no records were found, insert the record, otherwise add
IF @count = 0
BEGIN
  INSERT INTO
    PermitStatus
  (
    [PermitID],
    [UpdatedOn],
    [Status]
  )
  VALUES
  (
    @PermitID,
    GETUTCDATE(),
    1
  )
  END
  ELSE
    UPDATE
      PermitStatus
    SET
      [UpdatedOn]=GETUTCDATE(),
      [Status]=@Status
    WHERE
      [PermitID]=@PermitID

How do I loop through the records in the Table returned by my user-defined function to dynamically insert or update the records as needed?

Upvotes: 7

Views: 11612

Answers (6)

AdaTheDev
AdaTheDev

Reputation: 147224

If you're using SQL Server 2008, you can use table valued parameters - you pass in a table of records into a stored procedure and then you can do a MERGE.

Passing in a table valued parameter would remove the need to parse CSV strings.

Edit:
ErikE has raised the point about race conditions, please refer to his answer and linked articles.

Upvotes: 3

ErikE
ErikE

Reputation: 50211

There are various methods to accomplish the parts you ask are asking about.

Passing Values

There are dozens of ways to do this. Here are a few ideas to get you started:

  • Pass in a string of identifiers and parse it into a table, then join.
  • SQL 2008: Join to a table-valued parameter
  • Expect data to exist in a predefined temp table and join to it
  • Use a session-keyed permanent table
  • Put the code in a trigger and join to the INSERTED and DELETED tables in it.

Erland Sommarskog provides a wonderful comprehensive discussion of lists in sql server. In my opinion, the table-valued parameter in SQL 2008 is the most elegant solution for this.

Upsert/Merge

  • Perform a separate UPDATE and INSERT (two queries, one for each set, not row-by-row).
  • SQL 2008: MERGE.

An Important Gotcha

However, one thing that no one else has mentioned is that almost all upsert code, including SQL 2008 MERGE, suffers from race condition problems when there is high concurrency. Unless you use HOLDLOCK and other locking hints depending on what's being done, you will eventually run into conflicts. So you either need to lock, or respond to errors appropriately (some systems with huge transactions per second have used the error-response method successfully, instead of using locks).

One thing to realize is that different combinations of lock hints implicitly change the transaction isolation level, which affects what type of locks are acquired. This changes everything: which other locks are granted (such as a simple read), the timing of when a lock is escalated to update from update intent, and so on.

I strongly encourage you to read more detail on these race condition problems. You need to get this right.

Example Code

CREATE PROCEDURE dbo.PermitStatusUpdate
   @PermitIDs varchar(8000), -- or (max)
   @Status int
AS
SET NOCOUNT, XACT_ABORT ON -- see note below

BEGIN TRAN

DECLARE @Permits TABLE (
   PermitID int NOT NULL PRIMARY KEY CLUSTERED
)

INSERT @Permits
SELECT Value FROM dbo.Split(@PermitIDs) -- split function of your choice

UPDATE S
SET
   UpdatedOn = GETUTCDATE(),
   Status = @Status
FROM
   PermitStatus S WITH (UPDLOCK, HOLDLOCK)
   INNER JOIN @Permits P ON S.PermitID = P.PermitID

INSERT PermitStatus (
   PermitID,
   UpdatedOn,
   Status
)
SELECT
   P.PermitID,
   GetUTCDate(),
   @Status
FROM @Permits P
WHERE NOT EXISTS (
   SELECT 1
   FROM PermitStatus S
   WHERE P.PermitID = S.PermitID
)

COMMIT TRAN

RETURN @@ERROR;

Note: XACT_ABORT helps guarantee the explicit transaction is closed following a timeout or unexpected error.

To confirm that this handles the locking problem, open several query windows and execute an identical batch like so:

WAITFOR TIME '11:00:00' -- use a time in the near future
EXEC dbo.PermitStatusUpdate @PermitIDs = '123,124,125,126', 1

All of these different sessions will execute the stored procedure in nearly the same instant. Check each session for errors. If none exist, try the same test a few times more (since it's possible to not always have the race condition occur, especially with MERGE).

The writeups at the links I gave above give even more detail than I did here, and also describe what to do for the SQL 2008 MERGE statement as well. Please read those thoroughly to truly understand the issue.

Briefly, with MERGE, no explicit transaction is needed, but you do need to use SET XACT_ABORT ON and use a locking hint:

SET NOCOUNT, XACT_ABORT ON;
MERGE dbo.Table WITH (HOLDLOCK) AS TableAlias
... 

This will prevent concurrency race conditions causing errors.

I also recommend that you do error handling after each data modification statement.

Upvotes: 4

KM.
KM.

Reputation: 103587

create a split function, and use it like:

SELECT
    *
    FROM YourTable  y
    INNER JOIN dbo.splitFunction(@Parameter) s ON y.ID=s.Value

I prefer the number table approach

For this method to work, you need to do this one time table setup:

SELECT TOP 10000 IDENTITY(int,1,1) AS Number
    INTO Numbers
    FROM sys.objects s1
    CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)

Once the Numbers table is set up, create this function:

CREATE FUNCTION [dbo].[FN_ListToTableAll]
(
     @SplitOn  char(1)      --REQUIRED, the character to split the @List string on
    ,@List     varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN 
(

    ----------------
    --SINGLE QUERY-- --this WILL return empty rows
    ----------------
    SELECT
        ROW_NUMBER() OVER(ORDER BY number) AS RowNumber
            ,LTRIM(RTRIM(SUBSTRING(ListValue, number+1, CHARINDEX(@SplitOn, ListValue, number+1)-number - 1))) AS ListValue
        FROM (
                 SELECT @SplitOn + @List + @SplitOn AS ListValue
             ) AS InnerQuery
            INNER JOIN Numbers n ON n.Number < LEN(InnerQuery.ListValue)
        WHERE SUBSTRING(ListValue, number, 1) = @SplitOn

);
GO 

You can now easily split a CSV string into a table and join on it:

select * from dbo.FN_ListToTableAll(',','1,2,3,,,4,5,6777,,,')

OUTPUT:

RowNumber   ListValue
----------- ----------
1           1
2           2
3           3
4           
5           
6           4
7           5
8           6777
9           
10          
11          

(11 row(s) affected)  

To make what you need work, do the following:

--this would be the existing table
DECLARE @OldData  table (RowID  int, RowStatus char(1))

INSERT INTO @OldData VALUES (10,'z')
INSERT INTO @OldData VALUES (20,'z')
INSERT INTO @OldData VALUES (30,'z')
INSERT INTO @OldData VALUES (70,'z')
INSERT INTO @OldData VALUES (80,'z')
INSERT INTO @OldData VALUES (90,'z')


--these would be the stored procedure input parameters
DECLARE @IDList      varchar(500)
       ,@StatusList  varchar(500)
SELECT @IDList='10,20,30,40,50,60'
      ,@StatusList='A,B,C,D,E,F'

--stored procedure local variable
DECLARE @InputList  table (RowID  int, RowStatus char(1))

--convert input prameters into a table
INSERT INTO @InputList
        (RowID,RowStatus)
    SELECT
        i.ListValue,s.ListValue
        FROM dbo.FN_ListToTableAll(',',@IDList)            i
            INNER JOIN dbo.FN_ListToTableAll(',',@StatusList)  s ON i.RowNumber=s.RowNumber

--update all old existing rows
UPDATE o
    SET RowStatus=i.RowStatus
    FROM @OldData               o WITH (UPDLOCK, HOLDLOCK) --to avoid race condition when there is high concurrency as per @emtucifor
        INNER JOIN @InputList   i ON o.RowID=i.RowID

--insert only the new rows
INSERT INTO @OldData
        (RowID, RowStatus)
    SELECT
        i.RowID, i.RowStatus
        FROM @InputList               i
            LEFT OUTER JOIN @OldData  o ON i.RowID=o.RowID
        WHERE o.RowID IS NULL

--display the old table
SELECT * FROM @OldData order BY RowID

OUTPUT:

RowID       RowStatus
----------- ---------
10          A
20          B
30          C
40          D
50          E
60          F
70          z
80          z
90          z

(9 row(s) affected)

EDIT thanks to @Emtucifor click here for the tip about the race condition, I have included the locking hints in my answer, to prevent race condition problems when there is high concurrency.

Upvotes: 4

David Hall
David Hall

Reputation: 33143

You should be able to do your insert and your update as two set based queries.

The code below was based on a data load procedure that I wrote a while ago that took data from a staging table and inserted or updated it into the main table.

I've tried to make it match your example, but you may need to tweak this (and create a table valued UDF to parse your CSV into a table of ids).

-- Update where the join on permitstatus matches
Update
    PermitStatus
Set 
    [UpdatedOn]=GETUTCDATE(),
    [Status]=staging.Status
From 
    PermitStatus status
Join   
    StagingTable staging
On
    staging.PermitId = status.PermitId

-- Insert the new records, based on the Where Not Exists      
Insert 
    PermitStatus(Updatedon, Status, PermitId)
Select (GETUTCDATE(), staging.status, staging.permitId
From 
     StagingTable staging
Where Not Exists
(
    Select 1 from PermitStatus status
    Where status.PermitId = staging.PermidId 
)   

Upvotes: 2

user35559
user35559

Reputation: 1048

Essentially you have an upsert stored procedure (eg. UpsertSinglePermit)
(like the code you have given above) for dealing with one row.

So the steps I see are to create a new stored procedure (UpsertNPermits) which does

a) Parse input string into n record entries (each record contains permit id and status) b) Foreach entry in above, invoke UpsertSinglePermit

Upvotes: -1

Lucero
Lucero

Reputation: 60190

If you have SQL Server 2008, you can use MERGE. Here's an article describing this.

Upvotes: 2

Related Questions