Reputation: 35374
I have a table with this column (among others):
id int identity(1,1) not null
I've added a column to store the approximate date the record was inserted:
insertdate smalldatetime null
I've filled in insertdate where I can, using the earliest reference found in forensic searches of related tables and logs. This does, however, leave a number of NULL "gaps" in the data, as well as situations where a record with a lower ID has a more recent insertdate value than records with higher IDs.
The identity attribute provides an adequate basis to assume that a record must have been created before any record with a higher ID value, so I've decided to update the insertdate for any record where it is null or a subsequent ID has an earlier date:
UPDATE table
SET insertdate = (SELECT MIN(insertdate)
                  FROM table t2
                  WHERE t2.id >= table.id
                    AND t2.insertdate IS NOT NULL)
Unfortunately, updating like this is eating the server's lunch... 1 hour and counting for 2.5 million records.
Any ideas for how to do this more efficiently?
It only needs to be done once, but this is a production server, so I'd prefer to not lock up the table for any longer than necessary.
Upvotes: 0
Views: 70
Reputation: 45096
Not sure this will help, but you don't need to test for NULL: MIN() ignores NULLs.
UPDATE table
SET insertdate = (SELECT MIN(t2.insertdate)
                  FROM table t2
                  WHERE t2.id >= table.id
                    AND t2.id < table.id + 10000)
Could you limit the lookup to the next x rows, as the 10,000-row window in the query above does?
Is there a point beyond which you are pretty sure you are not going to find a smaller date?
You could also restrict both table and t2 to IDs before the first row inserted after you set a default on the date column.
A single update of this size might fill up the transaction log, in which case the whole thing rolls back.
If that happens, just be patient and let it roll back.
If you abort it now, it is going to have to roll back anyway.
Breaking the update up into batches of 100,000 rows lets the transaction log clear between batches.
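A minimal sketch of that batching, assuming SQL Server 2008 or later. dbo.MyTable, @BatchSize, @FromId and @MaxId are hypothetical names chosen for illustration, and the 10,000-row lookahead is carried over from the query above; adjust or drop it to suit your data.

```sql
-- Sketch only: dbo.MyTable is a hypothetical stand-in for the real table name.
DECLARE @BatchSize int = 100000;   -- rows (by ID range) touched per UPDATE
DECLARE @FromId int, @MaxId int;

SELECT @FromId = MIN(id), @MaxId = MAX(id)
FROM dbo.MyTable;

WHILE @FromId <= @MaxId
BEGIN
    -- Each UPDATE is its own autocommit transaction, so locks and log
    -- usage are scoped to one ID range at a time.
    UPDATE t
    SET insertdate = (SELECT MIN(t2.insertdate)
                      FROM dbo.MyTable AS t2
                      WHERE t2.id >= t.id
                        AND t2.id < t.id + 10000)  -- assumed lookahead window
    FROM dbo.MyTable AS t
    WHERE t.id >= @FromId
      AND t.id < @FromId + @BatchSize;

    -- Move on to the next ID range.
    SET @FromId = @FromId + @BatchSize;
END;
```

Because each row only reads rows with an equal or higher ID, the batches can run in any order. Log space is only freed between batches under the SIMPLE recovery model, or if log backups run while the loop is in progress.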
Upvotes: 2