bwilliamson

Reputation: 391

Purge job optimization

SQL Server 2008 R2 Enterprise

I have a database with 3 tables on which I keep a retention period of 15 days. It is a very active logging database, about 500 GB in size, and it grows by about 30 GB a day unless purged. I can't seem to get caught up on one of the tables and am falling behind. That table has 220 million rows, and I need to purge around 10-12 million rows from it nightly; right now I am about 30 million rows behind. I can only run this purge at night because of the volume of incoming inserts competing for table locks. I have confirmed that everything is indexed correctly and have run Brent Ozar's sp_BlitzIndex to verify that. Is there any way to optimize what I am doing below? I run the same purge steps for each table.

1. Drop and create 3 purge tables: Purge_Log, Purge_SLogHeader and Purge_SLogMessage (a rough sketch of this step is shown after the insert queries below).

2. Insert rows into the purge tables (takes about 5 minutes per table):

Insert Into Purge_Log
Select ID from ServiceLog
where startTime < dateadd(day, -15, getdate())

--****************************************************

Insert into Purge_SLogMessage
select serviceLogId from ServiceLogMessage
where serviceLogId in (select id from ServiceLog
                       where startTime < dateadd(day, -15, getdate()))

--****************************************************

Insert into Purge_SLogHeader
Select serviceLogId from ServiceLogHeader
where serviceLogId in (select id from ServiceLog
                       where startTime < dateadd(day, -15, getdate()))
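
For reference, step 1 just drops and recreates the purge tables, roughly like the following sketch (single key columns shown here, with INT assumed for simplicity):

IF OBJECT_ID('dbo.Purge_Log') IS NOT NULL DROP TABLE dbo.Purge_Log
CREATE TABLE dbo.Purge_Log (ID INT NOT NULL)

IF OBJECT_ID('dbo.Purge_SLogMessage') IS NOT NULL DROP TABLE dbo.Purge_SLogMessage
CREATE TABLE dbo.Purge_SLogMessage (serviceLogId INT NOT NULL)

IF OBJECT_ID('dbo.Purge_SLogHeader') IS NOT NULL DROP TABLE dbo.Purge_SLogHeader
CREATE TABLE dbo.Purge_SLogHeader (serviceLogId INT NOT NULL)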

After those rows are inserted, I run the following for each table, with the table and purge-table names substituted accordingly:

SET ROWCOUNT 1000

delete_more:
delete from ServiceLog
where Id in ( select Id from Purge_Log)

IF @@ROWCOUNT > 0 GOTO delete_more
SET ROWCOUNT 0     
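
The other two tables follow the same pattern, just with their own purge table and key column; for example, the ServiceLogMessage version looks roughly like this:

SET ROWCOUNT 1000

delete_more:
delete from ServiceLogMessage
where serviceLogId in (select serviceLogId from Purge_SLogMessage)

IF @@ROWCOUNT > 0 GOTO delete_more
SET ROWCOUNT 0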

Basically, does anyone see a way to make this procedure run faster, or is there a different way to go about it? I've made the queries as simple as possible, with only one subquery. I've also tried a join, and the execution plan shows the same estimated time to complete it that way. Any guidance would be appreciated.

Upvotes: 1

Views: 707

Answers (1)

Shushil Bohara

Reputation: 5656

You can use this technique for all the tables: collect the IDs into a temporary table first, so you avoid scanning the large original table again and again. I hope it will work well for all your tables:

DECLARE @del_query VARCHAR(MAX)

/* Take the IDs from ServiceLog directly instead of from Purge_Log, because Purge_Log may contain more rows than expected due to the frequent purging */

IF OBJECT_ID('tempdb..#tmp_log_ids') IS NOT NULL DROP TABLE #tmp_log_ids
SELECT ID INTO #tmp_log_ids FROM ServiceLog WHERE startTime < DATEADD(DAY, -15, GETDATE())

-- Delete from ServiceLog in batches of 100,000 rows until no full batch remains
SET @del_query = '
DELETE TOP (100000) sl
FROM ServiceLog sl
INNER JOIN #tmp_log_ids t ON t.ID = sl.ID'
WHILE 1 = 1
BEGIN
    EXEC(@del_query + ' OPTION (MAXDOP 5)')
    IF @@ROWCOUNT < 100000 BREAK;
END

-- Same batched delete for ServiceLogMessage
SET @del_query = '
DELETE TOP (100000) sl
FROM ServiceLogMessage sl
INNER JOIN #tmp_log_ids t ON t.ID = sl.serviceLogId'
WHILE 1 = 1
BEGIN
    EXEC(@del_query + ' OPTION (MAXDOP 5)')
    IF @@ROWCOUNT < 100000 BREAK;
END

-- Same batched delete for ServiceLogHeader
SET @del_query = '
DELETE TOP (100000) sl
FROM ServiceLogHeader sl
INNER JOIN #tmp_log_ids t ON t.ID = sl.serviceLogId'
WHILE 1 = 1
BEGIN
    EXEC(@del_query + ' OPTION (MAXDOP 5)')
    IF @@ROWCOUNT < 100000 BREAK;
END
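
If the joins against the temp table are still slow, one possible refinement (not shown above) is to add a clustered index on #tmp_log_ids right after populating it, for example (index name arbitrary):

CREATE CLUSTERED INDEX IX_tmp_log_ids ON #tmp_log_ids (ID)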

Upvotes: 2
