Ice

Reputation: 1172

How to delete 8,500,000 records from one table on SQL Server

DELETE FROM activities WHERE unt_uid IS NULL

would be the fastest way, but nobody can access the database/table until this statement has finished, so this is a no-go.

I defined a cursor to get this task done during working hours, but the impact on productivity is still too big. So how can I delete these records so that normal use of the database is still guaranteed?

It's SQL Server 2005 on 32-bit Windows 2003. Second question: how long would you estimate this job to take (6 hours or 60 hours)? (Yes, I know that depends on the load, but assume this is a small-business environment.)

Upvotes: 9

Views: 6158

Answers (6)

Carla Abanes

Reputation: 1

I would create a task for this and schedule it to run during off-peak hours. But I would not suggest deleting in the table while it is being used. Move the rows you want to keep to a new table and drop the current table with all the rows you want to delete.
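
A minimal sketch of that idea (assuming the rows to keep are simply those where unt_uid is not null, and that no foreign keys or triggers reference the table):

SELECT *
INTO activities_keep
FROM activities
WHERE unt_uid IS NOT NULL   -- copy only the rows to keep

DROP TABLE activities

EXEC sp_rename 'activities_keep', 'activities'   -- give the new table the old name

Note that SELECT INTO copies the data but not the indexes, constraints, or triggers, so those would have to be recreated on the new table.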

Upvotes: 0

Jonas Lincoln

Reputation: 9767

I'd use the "nibbling delete" technique. From http://sqladvice.com/blogs/repeatableread/archive/2005/09/20/12795.aspx:

DECLARE @target int
SET @target = 2000          -- batch size
DECLARE @count int
SET @count = 2000

WHILE @count = 2000 BEGIN   -- keep going until a batch deletes fewer rows than the batch size

 DELETE FROM myBigTable
 WHERE targetID IN
 (SELECT TOP (@target) targetID
  FROM myBigTable WITH(NOLOCK)
  WHERE something = somethingElse)

 SELECT @count = @@ROWCOUNT     -- rows deleted in this batch
 WAITFOR DELAY '000:00:00.200'  -- brief pause so other queries can get a turn

END

I've used it for exactly this type of scenario. The WAITFOR is important to keep; it allows other queries to do their work in between the deletes.
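
Adapted to the table from the question, a sketch could look like this (the batch size of 2000 is just the example's value; on SQL Server 2005, DELETE TOP (...) avoids the subquery because the filter is on the same table):

DECLARE @count int
SET @count = 2000

WHILE @count = 2000 BEGIN

 DELETE TOP (2000) FROM activities
 WHERE unt_uid IS NULL

 SELECT @count = @@ROWCOUNT
 WAITFOR DELAY '000:00:00.200'

END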

Upvotes: 2

Cade Roux

Reputation: 89711

In a small-business environment, it seems odd that you would need to delete 8,500,000 rows as part of standard operational behavior without affecting any other users. Typically for deletes that large, we make a new table and use TRUNCATE/INSERT or sp_rename to overwrite the old one.

Having said that, as a special case, one of my monthly processes can regularly delete 200 million rows in batches of around 3 million at a time if it detects that it needs to re-run the process which generated those 200 million rows. But this is a single-user process in a dedicated data warehouse database, and I wouldn't call it a small-business scenario.

I second the answers recommending seeking alternative approaches to your design.

Upvotes: 0

MarkR

Reputation: 63576

Who can access the table will depend on your transaction isolation level, I'd guess.

However, you're broadly right: lots of deletes are bad, particularly if your WHERE clause means the database cannot use an index. In that case it probably won't be able to lock only the rows it needs to delete, so it will end up taking a big lock on the whole table.

My best recommendation would be to redesign your application so you don't need to delete these rows, or possibly any rows.

You can either do this by partitioning the table such that you can simply drop partitions instead, or use the "copy the rows you want to keep then drop the table" recipe suggested by others.
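
For the partitioning route, the usual pattern is to switch the unwanted partition out to an identically structured staging table and then truncate it, which is a metadata-only operation. A sketch, assuming activities were partitioned on a suitable column (the question doesn't mention one, and table partitioning on SQL Server 2005 requires Enterprise Edition):

-- activities_staging must have the same structure and sit on the same filegroup
ALTER TABLE activities SWITCH PARTITION 1 TO activities_staging

TRUNCATE TABLE activities_staging   -- near-instant, no row-by-row delete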

Upvotes: 3

Ray

Reputation: 21905

Perhaps instead of deleting the records from your table, you could create a new identical table, insert the records you want to keep, and then rename the tables so the new one replaces the old one. This would still take some time, but the downtime on your site would be pretty minimal (just while swapping the tables).
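
A sketch of that swap (table names are illustrative; the new table should be created up front with the same schema and indexes so nothing has to be rebuilt afterwards):

-- copy the rows to keep into the pre-created table
-- (an identity column would also need SET IDENTITY_INSERT and an explicit column list)
INSERT INTO activities_new
SELECT * FROM activities
WHERE unt_uid IS NOT NULL

-- swap the names; this is the only moment other users are blocked
BEGIN TRANSACTION
    EXEC sp_rename 'activities', 'activities_old'
    EXEC sp_rename 'activities_new', 'activities'
COMMIT

-- once everything checks out:
-- DROP TABLE activities_old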

Upvotes: 5

Keltex

Reputation: 26436

You can do it in chunks. For example, every 10 seconds execute:

delete from activities where activityid in 
  (select top 1000 activityid from activities where unt_uid is null)

Obviously, pick the row count (I arbitrarily picked 1000) and the interval (I picked 10 seconds) that make the most sense for your application.
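
To see how far along the job is (and to get a feel for the second question, the overall duration, from the rate you observe), a quick check between chunks could be:

select count(*) as remaining
from activities with (nolock)   -- dirty read is fine for a rough progress figure
where unt_uid is null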

Upvotes: 9
