richardtallent

Reputation: 35374

How can I efficiently update a table based on other rows?

I have a table with this column (among others):

id int identity(1,1) not null

I've added a column to store the approximate date the record was inserted:

insertdate smalldatetime null

I've filled in the insertdate where I can, using the earliest reference found in forensic searches of related tables and logs. This still leaves a number of NULL "gaps" in the data, as well as situations where a record with a lower ID has a more recent insertdate than records with higher IDs.

The identity attribute provides an adequate basis to assume that a record must have been created before any record with a higher ID value, so I've decided to update the insertdate for any record where it is null or a subsequent ID has an earlier date:

UPDATE [table]
SET insertdate = (SELECT MIN(t2.insertdate)
                    FROM [table] t2
                   WHERE t2.id >= [table].id
                     AND t2.insertdate IS NOT NULL)

Unfortunately, updating like this is eating the server's lunch... 1 hour and counting for 2.5 million records.

Any ideas for how to do this more efficiently?

It only needs to be done once, but this is a production server, so I'd prefer to not lock up the table for any longer than necessary.

Upvotes: 0

Views: 70

Answers (1)

paparazzo

Reputation: 45096

Not sure this will help, but you don't need to test for NULL: MIN() does not consider NULLs.

UPDATE [table]
SET insertdate = (SELECT MIN(t2.insertdate)
                    FROM [table] t2
                   WHERE t2.id >= [table].id
                     AND t2.id < [table].id + 10000)

Could you limit to the next x rows?
Is there a point at which you are pretty sure you are not going to find a smaller date?

And once you set a default on the date column, you could limit both the outer table and t2 to IDs below the first row that received the default.

A single update this large might fill up the transaction log and roll the whole thing back. If that happens, just be patient and let it roll back; if you abort it now, it still has to roll back. Breaking the update into batches of 100,000 will let the transaction log clear between batches.
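A minimal sketch of that batching idea, assuming SQL Server, a table actually named [table], and reasonably dense id values (the @batch and 10000 look-ahead window sizes are illustrative, not tuned):

```sql
-- Process the table in id ranges so each UPDATE is its own small
-- transaction and the log can truncate between batches.
DECLARE @start int = 1,
        @batch int = 100000,
        @max   int;

SELECT @max = MAX(id) FROM [table];

WHILE @start <= @max
BEGIN
    UPDATE t
    SET insertdate = (SELECT MIN(t2.insertdate)
                        FROM [table] t2
                       WHERE t2.id >= t.id
                         AND t2.id < t.id + 10000)   -- limited look-ahead window
    FROM [table] t
    WHERE t.id >= @start
      AND t.id < @start + @batch;

    SET @start = @start + @batch;
END
```

Each loop iteration commits on its own, so locks are held briefly and an abort only loses the current batch, not the whole run.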

Upvotes: 2
