Kushal Karia

Reputation: 97

How to delete a large volume of data from SQL Server based on multiple conditions

I am looking for a SQL query where I have to delete data from a table based on a combination of two conditions.

I have a table dbo.ABC from which I have to delete the rows whose values in column_A and column_B match the combinations passed as input.

Table - ABC

column_A   column_B
===================
100        US
200        IND

Now I have to delete a record if column_A is 100 and column_B is US. But I want to pass multiple inputs, like 100 and 200 for column_A and US and IND for column_B, respectively. The pair 100 and US is one combination that deletes the first record, and 200 and IND is another combination that deletes the second record. Similarly, I want to pass 1000 inputs this way and delete 1000 records in a single shot, rather than calling the query 1000 times with a single input.

Can I use multiple IN statements to delete the data, since I have multiple combinations of column_A and column_B?

How would I use a for-loop to delete the data if I pass many combinations of column_A and column_B?

I am new to this field, so I do not have much knowledge.

Your help will be appreciated.

Thank you.

Upvotes: 1

Views: 3105

Answers (1)

O. Jones

Reputation: 108796

Please note: I have not debugged this. But I have used similar stuff for large-scale purge operations on a busy transactional database.

Create yourself a temporary table and populate it with the combinations you want to delete. For example,

CREATE TABLE #to_delete ( cola INT NOT NULL, colb NVARCHAR(200) NOT NULL);
GO
INSERT INTO #to_delete (cola, colb) VALUES
(100, N'US'),
(200, N'CN'),
(300, N'UK');
GO

Then do this to test your row-selection logic. This should only return the rows you want to delete.

SELECT TOP(500) ABC.*
  FROM ABC
  JOIN #to_delete d ON ABC.column_A = d.cola AND ABC.column_B = d.colb;

Once you're satisfied this gets only the doomed rows, do a so-called nibbling delete: repeat the delete query in a loop until it doesn't delete anything. Like this.

DECLARE @count INT = 1;
WHILE @count > 0 BEGIN
    WAITFOR DELAY '00:00:02';
    BEGIN TRANSACTION;
      SET DEADLOCK_PRIORITY LOW;
      DELETE TOP(500) ABC
        FROM ABC
        JOIN #to_delete d ON ABC.column_A = d.cola AND ABC.column_B = d.colb;
      SET @count = @@ROWCOUNT;
    COMMIT TRANSACTION;
END;
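When the loop finishes, drop the temporary table (SQL Server would also drop it automatically when your session ends):

DROP TABLE #to_delete;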

Why is this a good way to go?

  1. It deletes a small chunk of rows with each query, minimizing both the lock time for rows in the table and the size of each delete transaction.
  2. The low deadlock priority is just insurance. If one of these DELETE operations deadlocks with some other query, possibly on some index, we definitely want to kill the DELETE operation, not the other query. We can just repeat the DELETE query.
  3. The short delay between chunks gives your transactional workload time to run.
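As for your multiple-IN idea: two separate IN lists don't pair values up. Each condition applies independently, so a sketch like the one below would also delete rows such as (100, IND) and (200, US) that you never asked for. That's why joining against a table of pairs is the safer approach.

-- Sketch only: don't do this. The two IN lists act independently,
-- so this matches every combination, including (100, IND) and (200, US).
DELETE FROM ABC
 WHERE column_A IN (100, 200)
   AND column_B IN (N'US', N'IND');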

Upvotes: 2
