Susan

Reputation: 198

How to mimic "Bulk" Insert with DataTable in SQL Server

I have a stored procedure where I send in a user-defined type that is a table. I have simplified it to make it easier to read.

 CREATE TYPE [dbo].[ProjectTableType] AS TABLE(
     [DbId] [uniqueidentifier] NOT NULL,
     [DbParentId] [uniqueidentifier] NULL,
     [Description] [text] NULL
 )

 CREATE PROCEDURE [dbo].[udsp_ProjectDUI] (@cmd varchar(10), 
      @tblProjects ProjectTableType READONLY) AS BEGIN
      DECLARE @myNewPKTable TABLE (myNewPK uniqueidentifier)

      IF(LOWER(@cmd) = 'insert')
      BEGIN
          INSERT INTO dbo.Project 
          (
              DbId,
              DbParentId,
              Description
          )
          OUTPUT INSERTED.DbId INTO @myNewPKTable
          SELECT NEWID(),
                 DbParentId,
                 Description
          FROM @tblProjects;

          SELECT * FROM dbo.Project WHERE DbId IN (SELECT myNewPK FROM @myNewPKTable);
      END
 END
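
For context, calling the procedure from T-SQL looks roughly like this (a sketch; the sample values are made up, and DbId must be supplied because the type declares it NOT NULL, even though the procedure generates its own):

 DECLARE @p ProjectTableType;

 INSERT INTO @p (DbId, DbParentId, Description)
 VALUES (NEWID(), NULL, 'Root project'),
        (NEWID(), NULL, 'Another project');

 EXEC [dbo].[udsp_ProjectDUI] @cmd = 'insert', @tblProjects = @p;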

This is for a DLL that other applications will use, so we aren't necessarily in charge of validation. I want to mimic BULK INSERT, where if one row fails to insert but the other rows are fine, the valid rows still insert. Is there a way to do this? I want the same behavior for UPDATE: if one row fails, the stored procedure should continue trying to update the others.

The only option I can think of is to process one row at a time (either a loop in the code where the stored procedure is called multiple times, or a loop in the stored procedure), but I was wondering what the performance hit would be for that, or whether there is a better solution.

Upvotes: 0

Views: 128

Answers (1)

Xedni

Reputation: 4695

Not sure which errors you're wanting to continue on, but unless you're running into all manner of unexpected errors, I'd try to avoid devolving into RBAR ("row by agonizing row") processing just yet.

Check Explicit Violations

The main thing I would expect is PK violations, which you can avoid by just checking for existence before the insert (and update). If there are other business-logic failure conditions, you can check those here as well.

insert into dbo.Project 
(
    DbId,
    DbParentId,
    Description
)
output inserted.DbId 
into @myNewPKTable (myNewPK) -- as declared in the procedure above
select
    DbId = newid(),
    DbParentId = s.DbParentId,
    Description = s.Description
from @tblProjects s -- source
-- LOJ null check makes sure we don't violate PK
-- NOTE: I'm pretending DbParentId is an alternate key of the table. 
left outer join dbo.Project t -- target
    on s.DbParentId = t.DbParentId 
where t.DbParentId is null

If at all possible, I'd try to stick with a batch update, and use join predicates to eliminate the possibility of most errors you expect to see. Changing to RBAR processing because you're worried you "might" get a system shutdown failure is probably a waste of time. Then, if you hit a really nasty error you can't recover from, fail the batch, legitimately.
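
The same existence-check pattern covers the UPDATE case in the question. A minimal sketch, assuming rows are matched on DbId:

update t
set
    t.DbParentId = s.DbParentId,
    t.Description = s.Description
from dbo.Project t -- target
inner join @tblProjects s -- source
    on s.DbId = t.DbId
-- the inner join silently skips source rows with no matching target,
-- so one bad row can't take down the rest of the batch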

RBAR

Alternatively, if you absolutely need row-by-row granularity of successes or failures, you could do try/catch around each statement and have the catch block do nothing (or log something).

declare 
    @DBParentId uniqueidentifier, 
    @Description nvarchar(1000)

declare @myNewPKTable table (myNewPK uniqueidentifier)

declare c cursor local fast_forward for
    select
        DbParentId,
        Description
    from @tblProjects
open c

fetch next from c into @DBParentId, @Description

while @@fetch_status = 0
begin

    begin try

        insert into dbo.Project 
        (
            DbId,
            DbParentId,
            Description
        )
        output inserted.DbId 
        into @myNewPKTable (myNewPK)
        select
            newid(),
            @DBParentId,
            @Description

    end try
    begin catch
        -- log something if you want
        print error_message()
    end catch

    fetch next from c into @DBParentId, @Description

end

close c
deallocate c
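
If "log something" should be more durable than a print, the catch block can write to an error table instead. A sketch, where dbo.ProjectImportError is a hypothetical logging table (not part of the original schema):

-- hypothetical logging table
create table dbo.ProjectImportError
(
    DbParentId   uniqueidentifier null,
    Description  nvarchar(1000)   null,
    ErrorNumber  int              not null,
    ErrorMessage nvarchar(4000)   not null,
    LoggedAt     datetime2        not null default sysutcdatetime()
);

-- in the catch block, instead of print:
insert into dbo.ProjectImportError (DbParentId, Description, ErrorNumber, ErrorMessage)
values (@DBParentId, @Description, error_number(), error_message());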

Hybrid

You might be able to get clever and hybridize things. One option might be to make the web-facing insert procedure actually insert into a lightweight, minimally keyed/constrained table (like a queue). Then, every minute or so, have an Agent job run through the logged calls and operate on them in a batch. It doesn't fundamentally change any of the patterns here, but it makes the processing asynchronous so the caller doesn't have to wait, and by batching requests together you can piggyback on what SQL does best: set-based ops.
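
A rough sketch of that queue pattern (the table name and job logic are made up):

-- lightweight landing table; no FKs or business constraints, so inserts rarely fail
create table dbo.ProjectQueue
(
    QueueId     int identity primary key,
    DbParentId  uniqueidentifier null,
    Description nvarchar(1000)   null
);

-- what the Agent job would run every minute or so:
-- one set-based pass using the same existence check as above, then clear the queue
begin transaction;

insert into dbo.Project (DbId, DbParentId, Description)
select newid(), q.DbParentId, q.Description
from dbo.ProjectQueue q
left outer join dbo.Project t
    on q.DbParentId = t.DbParentId
where t.DbParentId is null;

delete from dbo.ProjectQueue;

commit transaction;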

Another option might be to do the most set-based processing you can (using checks to prevent business-rule or constraint violations). If anything fails, you could then spin off an RBAR process for the remaining rows. If everything succeeds, that RBAR process is never hit.
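
Wired up, that might look something like this (a sketch; the cursor loop is the one from the RBAR section):

begin try
    -- optimistic path: one set-based insert for the whole batch
    insert into dbo.Project (DbId, DbParentId, Description)
    select newid(), DbParentId, Description
    from @tblProjects;
end try
begin catch
    -- the failed statement rolled back as a unit, so fall back to the
    -- row-by-row cursor loop to salvage the rows that can still succeed
    print error_message();
    -- (cursor loop from the RBAR section goes here)
end catch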

Conclusion

Several ways to go about this. I'd try to use set-based operations as much as possible unless you have a really strong reason for needing row-by-row granularity.

  • You can avoid most errors just by constructing your insert/update statement correctly.

  • If you need to, you can use a try/catch with an "empty" catch block so failures don't stop the overall processing.

  • Depending on random odds and ends of your specific situation, you might want or need to hybridize these two approaches

Upvotes: 2
