Reputation: 6514
I am fairly new to locks and hints.
I have a table with very frequent SELECT
and INSERT
operations. The table has 11 million records.
I have added a new column to it and I need to copy over the data from an existing column in the same table to the new column.
I am planning to use ROWLOCK
hint to avoid escalating locks to table level locks and blocking out all other operations on the table. For example:
UPDATE
SomeTable WITH (ROWLOCK)
SET
NewColumn = OldColumn
Questions:
NOLOCK
instead of ROWLOCK
? Note, once the records are inserted in the table, the value for OldColumn does not change, so NOLOCK
would not cause dirty reads.NOLOCK
even make sense in this case, because the SQL Server would have to anyways get update locks for UPDATE
.I know hints are to be avoided and SQL Server usually makes smarter choices, but I don't want to get the table locked out during this update.
Upvotes: 17
Views: 49414
Reputation: 1931
Try to update in batches.
DECLARE @Batch INT = 1000
DECLARE @Rowcount INT = @Batch
WHILE @Rowcount > 0
BEGIN
;WITH CTE AS
(
SELECT TOP (@Batch) NewColumn,OldColumn
FROM SomeTable
WHERE NewColumn <> OldColumn
OR (NewColumn IS NULL AND OldColumn IS NOT NULL)
)
UPDATE cte
SET NewColumn = OldColumn;
SET @Rowcount = @@ROWCOUNT
END
Upvotes: 19
Reputation: 1535
We recently had a case where we wanted to do something similar, but slowly over days (updating only a certain number of records per run, and only during certain hours). The most recent data was fine, but millions of rows of older data needed to be updated. Our data table looks something like this:
Create Table FileContent
(
FileContent varchar(max),
File_PK bigint,
NewFileContent varchar(max)
)
And we only needed to update certain rows, but millions of them. We created a table to store our progress so we could use a scheduled job to iterate through and update the main table, then populated this table with the primary keys of the main table records that needed updating:
Create Table FilesToUpdate
(
File_PK bigint,
IsUpdated bit NOT NULL DEFAULT 0
)
Then we scheduled the following script to do the updating (for your own use, play with the batch sizing and scheduling for what works with your system).
/***
Script to update and fix records.
***/
DECLARE @Rowcount INT = 1 --
, @BatchSize INT = 100 -- how many rows will be updated on each iteration of the loop
, @BatchesToRun INT = 25 -- the max number of times the loop will iterate
, @StartingRecord BIGINT = 1;
-- Get the highest File_PK not already fixed as a starting point.
Select @StartingRecord = MAX(File_PK) From FilesToUpdate where IsUpdated = 0
-- While there are still rows to update and we haven't hit our limit on iterations...
WHILE (@Rowcount > 0 and @BatchesToRun > 0)
BEGIN
print Concat('StartingRecord (Start of Loop): ', @StartingRecord)
UPDATE FileContent SET NewFileContent = 'New value here'
WHERE File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;
-- @@Rowcount is the number of records affected by the last statement. If this returns 0, the loop will stop because we've run out of things to update.
SET @Rowcount = @@ROWCOUNT;
print Concat('RowCount: ', @Rowcount)
-- Record which PKs were updated so we know where to start next time around.
UPDATE FilesToUpdate Set IsUpdated = 1 where File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;
-- The loop will stop after @BatchSize*@BatchesToRun records are updated.
-- If there aren't that many records left to update, the @Rowcount checks will stop it.
SELECT @BatchesToRun = @BatchesToRun - 1
print Concat('Batches Remaining: ',@BatchesToRun)
-- Set the starting record for the next time through the loop.
SELECT @StartingRecord -= @BatchSize
print Concat('StartingRecord (End of Loop): ', @StartingRecord)
END
Upvotes: 0
Reputation: 6514
I took @pacreely's approach (see his answer on this question) of updating in batches and created a update...top
variation. I added the (rowlock) hint tell SQL server to keep the locks at row level.
See update...top for details. Also note, that you cannot use order by
when using top
in update
, insert
, merge
, delete
statement so the referenced rows are not arranged in any order.
declare @BatchSize int = 1000
declare @RowCount int = @BatchSize
while @RowCount > 0
begin
update top (@BatchSize) SomeTable with (rowlock)
set NewColumn = OldColumn
where
NewColumn <> OldColumn or
(
NewColumn is null and
OldColumn is not null
)
select @RowCount = @@rowcount
end
Upvotes: 8