Reputation: 1027
I am using SQL Server 2008 and running the following stored procedure that needs to "clean" a 70 mill table from about 50 mill rows to another table, the id_col
is integer
(primary identity key)
According to the last running I made it is working good but it is expected to last for about 200 days:
SET NOCOUNT ON
-- define the last ID handled
DECLARE @LastID integer
SET @LastID = 0
declare @tempDate datetime
set @tempDate = dateadd(dd,-20,getdate())
-- define the ID to be handled now
DECLARE @IDToHandle integer
DECLARE @iCounter integer
DECLARE @watch1 nvarchar(50)
DECLARE @watch2 nvarchar(50)
set @iCounter = 0
-- select the next to handle
SELECT TOP 1 @IDToHandle = id_col
FROM MAIN_TABLE
WHERE id_col> @LastID and DATEDIFF(DD,someDateCol,otherDateCol) < 1
and datediff(dd,someDateCol,@tempDate) > 0 and (some_other_int_col = 1745 or some_other_int_col = 1548 or some_other_int_col = 4785)
ORDER BY id_col
-- as long as we have s......
WHILE @IDToHandle IS NOT NULL
BEGIN
IF ((select count(1) from SOME_OTHER_TABLE_THAT_CONTAINS_20k_ROWS where some_int_col = @IDToHandle) = 0 and (select count(1) from A_70k_rows_table where some_int_col =@IDToHandle )=0)
BEGIN
INSERT INTO SECONDERY_TABLE
SELECT col1,col2,col3.....
FROM MAIN_TABLE WHERE id_col = @IDToHandle
EXEC [dbo].[DeleteByID] @ID = @IDToHandle --deletes the row from 2 other tables that is related to the MAIN_TABLE and than from the MAIN_TABLE
set @iCounter = @iCounter +1
END
IF (@iCounter % 1000 = 0)
begin
set @watch1 = 'iCounter - ' + CAST(@iCounter AS VARCHAR)
set @watch2 = 'IDToHandle - '+ CAST(@IDToHandle AS VARCHAR)
raiserror ( @watch1, 10,1) with nowait
raiserror (@watch2, 10,1) with nowait
end
-- set the last handled to the one we just handled
SET @LastID = @IDToHandle
SET @IDToHandle = NULL
-- select the next to handle
SELECT TOP 1 @IDToHandle = id_col
FROM MAIN_TABLE
WHERE id_col> @LastID and DATEDIFF(DD,someDateCol,otherDateCol) < 1
and datediff(dd,someDateCol,@tempDate) > 0 and (some_other_int_col = 1745 or some_other_int_col = 1548 or some_other_int_col = 4785)
ORDER BY id_col
END
Any ideas or directions to improve this procedure run-time will be welcomed
Upvotes: 1
Views: 483
Reputation: 146499
Yes, try this:
Declare @Ids Table (id int Primary Key not Null)
Insert @Ids(id)
Select id_col
From MAIN_TABLE m
Where someDateCol >= otherDateCol
And someDateCol < @tempDate -- If there are times in these datetime fields,
-- then you may need to modify this condition.
And some_other_int_col In (1745, 1548, 4785)
And Not exists (Select * from SOME_OTHER_TABLE_THAT_CONTAINS_20k_ROWS
Where some_int_col = m.id_col)
And Not Exists (Select * From A_70k_rows_table
Where some_int_col = m.id_col)
Select id from @Ids -- this to confirm above code generates the correct list of Ids
return -- this line to stop (Not do insert/deletes) until you have verified @Ids is correct
-- Once you have verified that above @Ids is correctly populated,
-- then delete or comment out the select and return lines above so insert runs.
Begin Transaction
Delete OT -- eliminate row-by-row call to second stored proc
From OtherTable ot
Join MAIN_TABLE m On m.id_col = ot.FKCol
Join @Ids i On i.Id = m.id_col
Insert SECONDERY_TABLE(col1, col2, etc.)
Select col1,col2,col3.....
FROM MAIN_TABLE m Join @Ids i On i.Id = m.id_col
Delete m -- eliminate row-by-row call to second stored proc
FROM MAIN_TABLE m
Join @Ids i On i.Id = m.id_col
Commit Transaction
Explaanation.
You had numerous filtering conditions that were not SARGable, i.e., they would force a complete table scan for every iteration of your loop, instead of being able to use any existing index. Always try to avoid filter conditions that apply processing logic to a table column value before comparing it to some other value. This eliminates the opportunity for the query optimizer to use an index.
You were executing the inserts one at a time... Way better to generate a list of PK Ids that need to be processed (all at once) and then do all the inserts at once, in one statement.
Upvotes: 3