master_cylinder
master_cylinder

Reputation: 85

copying data out of a huge production table

The company I'm working for has some huge log/journal tables that are being written to every 10 seconds or so with transactions. I'd like to copy out a lot of the data in this table and delete it from that table because about 75% of the old data could be put in an archive table or something, but if I do this wrong and the table gets locked up it would be a disaster.

In a previous question a guy came up with something like this, I'd like to know that this wouldn't screw everything up, is the nolock hint enough to keep me safe and all the writes working just fine? If not, what should I do?

set identity_insert newtable on
DECLARE @StartID bigint, @LastID bigint, @EndID bigint
select @StartID = max(id)+1
from newtable

select @LastID = max(ID)
from oldtable

while @StartID < @LastID
begin
set @EndID = @StartID + 1000000

insert into newtable (FIELDS,GO,HERE)
select FIELDS,GO,HERE from oldtable (NOLOCK)
where id BETWEEN @StartID AND @EndId

set @StartID = @EndID + 1
end
set identity_insert newtable off
go

Upvotes: 5

Views: 224

Answers (2)

JNK
JNK

Reputation: 65157

The extreme caution in generating your list is probably overboard, but you will want to run a batch delete.

For the INSERT, you probably don't need the WHILE loop. For the DELETE, though, I would use something like this (tune the batch size to your needs):

WHILE 1=1
BEGIN
   DELETE TOP (10000) o
   FROM OldTable o
   INNER JOIN NewTable N
      ON o.id = n.id
   IF @@ROWCOUNT < 10000 BREAK;
END

This will DELETE 10k records at a time as long as there are records to delete.

Upvotes: 3

sazh
sazh

Reputation: 1832

One option is to partition the table by hour (assuming you have a DATETIME column in the table that defaults to GETDATE() on every insert). Having the partitions allows you to perform maintenance (drop, copy, etc.) on the older partitions without affecting the current one.

Upvotes: 0

Related Questions