backtrack

Reputation: 8154

Which is best practice: bulk delete or TOP 10000 rows?

I need to delete 6.3 million records from 25 tables in production. I have the following two queries and I don't know which one is best:

delete jc
from Jtable jc
join J1Table j on j.JobId = jc.JobId
join nonJobReProcess n on n.jobid = j.JobId

or

while exists (select top 1 * from dbo.Jtable jc (nolock)
              join J1Table j (nolock) on j.JobId = jc.JobId
              join nonJobReProcess n (nolock) on n.jobid = j.JobId)
begin

    delete top (10000) jc
    from dbo.Jtable jc (nolock)
    join J1Table j (nolock) on j.JobId = jc.JobId
    join nonJobReProcess n (nolock) on n.jobid = j.JobId

end

I'm using SQL Server 2008 R2, and I need some explanation too. Can anyone help?

Upvotes: 1

Views: 4862

Answers (4)

Kahn

Reputation: 1660

One point to consider here is that in production you may not be able to set the recovery model to SIMPLE. If so, I'd recommend looking further into how much data you're going to be deleting, and whether or not it can cause issues by filling up tempdb or the log files.

I recently created a similarly, obscenely large deletion process that would have filled up tempdb and the log several times over. Instead, I turned it into a job that checks the current space usage after every batch and aborts when the remaining space approaches a level where production can still run comfortably until the next backup, then picks up from there the next time it runs.

Below is a sample of how to gather reserved and available space from tempdb and the log. I'm not sure of its reliability as a whole, but it has worked thus far on SQL Server 2008 / 2012. Of course, it won't tell you how much room your SQL Server's disk has left to grow the reserved log / tempdb space (that I just found out the old-fashioned way, and went ahead and looked).

-- Capture the output of DBCC SQLPERF(LOGSPACE) for every database
DECLARE @LOGSTATS TABLE (DBNAME VARCHAR(256), LOGSIZE_MB DEC(32,2)
    , [LOGSPACEUSED_%] DEC(32,2), STATUS BIT)

INSERT INTO @LOGSTATS
EXEC('DBCC SQLPERF(LOGSPACE) WITH NO_INFOMSGS')

SELECT DB_NAME() DBName
    -- Free log space in the current database, in GB
    , CAST((SELECT (LOGSIZE_MB-(LOGSIZE_MB*([LOGSPACEUSED_%]/100)))/1024
        FROM @LOGSTATS WHERE DBNAME = DB_NAME()) AS DEC(18,2))
    AS Log_free_GB
    -- Reserved log file size in GB (size is stored in 8 KB pages)
    , (SELECT CAST((CAST((size*8) AS DEC(18,2))/1024)/1024 AS DEC(18,2))
        FROM sys.master_files WHERE DB_NAME(database_id) = DB_NAME() AND type_desc = 'LOG')
    AS Log_reserved_GB
    -- Unallocated tempdb space in GB (128 pages = 1 MB)
    , (SELECT CAST((SUM(unallocated_extent_page_count)*1.0/128)/1024 AS DEC(18,2))
        FROM tempdb.sys.dm_db_file_space_usage)
    AS Tempdb_free_GB
    -- Reserved tempdb data file size in GB
    , (SELECT CAST(SUM(size*1.0/128)/1024 AS DEC(18,2))
        FROM tempdb.sys.database_files WHERE type_desc = 'ROWS')
    AS Tempdb_reserved_GB

Basically, if you can assume that the production tempdb and log space have already grown to what the system comfortably supports, you can put your deletion batch into a WHILE loop whose condition keeps going until the available space drops below some threshold, for example 50 GB (the right figure depends on your DB and environment, of course).

This is a customized solution of course, so you need to find out how it works for your scenario.
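For illustration, here is a minimal sketch of that idea, reusing the log-space calculation above. The table names are taken from the question, and the 50 GB threshold is purely hypothetical; tune both to your environment.

DECLARE @LOGSTATS TABLE (DBNAME VARCHAR(256), LOGSIZE_MB DEC(32,2)
    , [LOGSPACEUSED_%] DEC(32,2), STATUS BIT)
DECLARE @LogFreeGB DEC(18,2)

WHILE 1 = 1
BEGIN
    -- Refresh the log-space figures before every batch
    DELETE FROM @LOGSTATS
    INSERT INTO @LOGSTATS
    EXEC('DBCC SQLPERF(LOGSPACE) WITH NO_INFOMSGS')

    SELECT @LogFreeGB = (LOGSIZE_MB-(LOGSIZE_MB*([LOGSPACEUSED_%]/100)))/1024
    FROM @LOGSTATS WHERE DBNAME = DB_NAME()

    -- Stop while production can still run comfortably until the next backup
    IF @LogFreeGB < 50 BREAK

    DELETE TOP (10000) jc
    FROM dbo.Jtable jc
    JOIN J1Table j ON j.JobId = jc.JobId
    JOIN nonJobReProcess n ON n.jobid = j.JobId

    IF @@ROWCOUNT = 0 BREAK -- nothing left to delete
END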

Upvotes: 1

StuartLC

Reputation: 107327

The issue may be more about the overall concurrency of the database during the delete: if the bulk delete is part of a background cleanup process, the impact of the delete on the rest of the system matters more than the raw performance of the delete itself.

At approximately 5000 locks, SQL Server will start considering escalation to a table lock, which will then block concurrent writers to the table (e.g. inserters) until your deletion is complete. This is the scenario a looped batch delete is designed to avoid, since each batch can stay under the escalation threshold.

Another consideration is logging. Under the SIMPLE recovery model, batching the deletes will keep the log size down, although you should add a CHECKPOINT after each batch so the log space can be reused.

In general, however, the single delete statement should be the default approach for raw delete performance, unless concurrency is important.

Also, there is no point using the NOLOCK optimizer hint on the table being deleted from: deletion requires exclusive locks.

delete top (10000) jc
from dbo.Jtable jc -- (nolock) - no point, the delete takes exclusive locks anyway
join J1Table j on j.JobId = jc.JobId
join nonJobReProcess n on n.jobid = j.JobId
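Putting the locking and logging points together, a minimal sketch of the batched variant, assuming the table names from the question, the SIMPLE recovery model, and a batch size chosen to stay under the escalation threshold:

while 1 = 1
begin

    -- a batch of 4000 rows stays under the ~5000-lock escalation threshold
    -- (the exact batch size is an assumption; tune it for your system)
    delete top (4000) jc
    from dbo.Jtable jc
    join J1Table j on j.JobId = jc.JobId
    join nonJobReProcess n on n.jobid = j.JobId

    if @@rowcount = 0 break -- nothing left to delete

    checkpoint -- under SIMPLE recovery, lets the log space be reused between batches

end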

Upvotes: 2

user3110088

Reputation: 1

I understand your problem, since I have been through it once myself. The following link might help you too.

Please have a look at it.

Upvotes: 0

Dave C

Reputation: 7402

If you try to delete 6.3M rows in one batch, it all has to be processed and logged before it can be committed. Your log files will grow quickly and hurt performance. Best practice here is to delete in batches. If you're on a production server, it's also nice to add a WAITFOR between batches to give the server some time to breathe:

WAITFOR DELAY '00:00:10'

Also, NOLOCK hints are ignored on delete operations.

Here's a loop that will run until everything is deleted.

DECLARE @BATCHCOUNT INT
SET @BATCHCOUNT = 10000 -- SET INITIAL BATCH SIZE HERE...
WHILE @BATCHCOUNT > 0
BEGIN
   DELETE TOP(@BATCHCOUNT) FROM MYTABLE
   SET @BATCHCOUNT = @@ROWCOUNT -- LOOP EXITS ONCE A BATCH DELETES NOTHING
   WAITFOR DELAY '00:00:05'     -- PAUSE SO OTHER WORKLOADS CAN RUN
END
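Adapted to the joined delete from the question, the same pattern might look like this (a sketch; the tables and join columns are copied from the question, and the delay is just an example):

DECLARE @BATCHCOUNT INT
SET @BATCHCOUNT = 10000 -- initial batch size

WHILE @BATCHCOUNT > 0
BEGIN
   DELETE TOP(@BATCHCOUNT) jc
   FROM dbo.Jtable jc
   JOIN J1Table j ON j.JobId = jc.JobId
   JOIN nonJobReProcess n ON n.jobid = j.JobId

   SET @BATCHCOUNT = @@ROWCOUNT -- exits once a batch deletes nothing
   WAITFOR DELAY '00:00:05'
END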

Upvotes: 2
