T McKeown
T McKeown

Reputation: 12857

Efficient DELETE TOP?

Is it more efficient and ultimately FASTER to delete rows from a DB in blocks of 1000 or 10000? I am having to remove approx 3 million rows from many tables. I first did the deletes in blocks of 100K rows but the performance wasn't looking good. I changed to 10000 and seem to be removing faster. Wondering if even smaller like 1K per DELETE statement is even better.

Thoughts?

I am deleting like this:

DELETE TOP(10000)
FROM TABLE
WHERE Date < '1/1/2012'

Upvotes: 1

Views: 805

Answers (3)

TomTom
TomTom

Reputation: 62127

Yes, it is. It all depends on your server though. I mean, last time I did that i was using this approeach to delete things in 64 million increments (on a table that had at that point around 14 billion rows, 80% Of which got ultimately deleted). I got a delete through every 10 seconds or so.

It really depends on your hardware. Going more granular is more work but it means less waiting for tx logs for other things operating on the table. You have to try out and find where you are comfortable - there is no ultimate answer because it is totally dependent on usage of the table and hardware.

Upvotes: 2

Paul
Paul

Reputation: 1066

Yes, no, it depends on the usage of table due to locking. I would try to delete the records in a slower pace. So the opposite of the op's question.

set rowcount 10000
while @@rowcount > 0
begin
    waitfor delay '0:0:1'      
    delete 
    from table 
    where date < convert(datetime, '20120101', 112)
end
set rowcount 0

Upvotes: 1

SteveB
SteveB

Reputation: 799

We used Table Partitioning to remove 5 million rows in less than a sec but this was from just one table. It took some work up-front but ultimately was the best way. This may not be the best way for you.

From our document about partitioning:

Let’s say you want to add 5 million rows to a table but don’t want to lock the table up while you do it. I ran into a case in an ordering system where I couldn’t insert the rows without stopping the system from taking orders. BAD! Partitioning is one way of doing it if you are adding rows that don’t overlap current data.

WHAT TO WATCH OUT FOR:

  • Data CANNOT overlap current data. You have to partition the data on a value. The new data cannot be intertwined within the currently partitioned data. If removing data, you have to remove an entire partition or partitions. You will not have a WHERE clause.

  • If you are doing this on a production database and want to limit the locking on the table, create your indexes with “ONLINE = ON”.

OVERVIEW OF STEPS:

  • FOR ADDING RECORDS

    • Partition the table you want to add records to (leave a blank partition for the new data). Do not forget to partition all of your indexes.
    • Create new table with the exact same structure (keys, data types, etc.).
    • Add a constraint to the new table to limit that data so that it would fit into the blank partition in the old table.
    • Insert new rows into new table.
    • Add indexes to match old table.
    • Swap the new table with the blank partition of the old table.
    • Un-partition the old table if you wish.
  • FOR DELETING RECORDS

    • Partition the table into sets so that the data you want to delete is all on partitions by itself (this could be many different partitions).
    • Create a new table with the same partitions.
    • Swap the partitions with the data you want to delete to the new table.
    • Un-partition the old table if you wish.

Upvotes: 2

Related Questions