Reputation: 31
I use the following SQL query to update MyTable. The code takes between 5 and 15 minutes to update MyTable as long as ROWS <= 100000000, but when ROWS > 100000000 it takes exponentially longer. How can I change this code to be set-based instead of using a while loop?
DECLARE @startTime DATETIME
DECLARE @batchSize INT
DECLARE @iterationCount INT
DECLARE @i INT
DECLARE @from INT
DECLARE @to INT

SET @batchSize = 10000
SET @i = 0

-- number of batches needed to cover all matching rows
SELECT @iterationCount = COUNT(*) / @batchSize
FROM MyTable
WHERE LitraID = 8175
  AND id BETWEEN 100000000 AND 300000000

WHILE @i <= @iterationCount
BEGIN
    BEGIN TRANSACTION T

    SET @startTime = GETDATE()
    SET @from = @i * @batchSize
    SET @to = (@i + 1) * @batchSize - 1

    -- re-number all matching rows, then update only the current batch
    ;WITH data AS (
        SELECT DoorsReleased, ROW_NUMBER() OVER (ORDER BY id) AS Row
        FROM MyTable
        WHERE LitraID = 8175
          AND id BETWEEN 100000000 AND 300000000
    )
    UPDATE data
    SET DoorsReleased = ~DoorsReleased
    WHERE Row BETWEEN @from AND @to

    SET @i = @i + 1

    COMMIT TRANSACTION T
END
Upvotes: 1
Views: 135
Reputation: 45096
This will eliminate the loop:
UPDATE MyTable
SET DoorsReleased = ~DoorsReleased
WHERE LitraID = 8175
  AND id BETWEEN 100000000 AND 300000000
  AND DoorsReleased IS NOT NULL -- if DoorsReleased is nullable
  -- AND DoorsReleased <> ~DoorsReleased  (struck out: always true for non-null values)
If you are set on looping: note that the code below will NOT work. I thought ~ was part of the column name, but it is the bitwise NOT operator, so DoorsReleased <> ~DoorsReleased is always true for non-null values and the loop never runs out of rows to re-flip.
select 1; -- prime @@ROWCOUNT so the loop body executes at least once
WHILE (@@ROWCOUNT > 0)
BEGIN
    UPDATE TOP (100000) MyTable
    SET DoorsReleased = ~DoorsReleased
    WHERE LitraID = 8175
      AND id BETWEEN 100000000 AND 300000000
      AND ( DoorsReleased <> ~DoorsReleased -- always true for non-null values, so nothing is filtered out
            OR ( DoorsReleased IS NULL AND ~DoorsReleased IS NOT NULL ) -- never true: ~NULL is NULL
          )
END
Inside a transaction I don't think looping would have value, as the transaction log cannot clear. And a batch size of 10,000 is small.
As stated in a comment, if you want to loop then key the batches on id rather than ROW_NUMBER(); recomputing all those row numbers on every pass is expensive. See the sketch below.
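A minimal sketch of that idea, assuming id is unique and indexed; @lastId and the @updated table variable are names invented for illustration:

DECLARE @lastId INT = 99999999   -- one below the bottom of the id range
DECLARE @updated TABLE (id INT)

WHILE 1 = 1
BEGIN
    -- keyset paging: each pass picks up where the previous one ended,
    -- so the loop terminates once the id range is exhausted
    ;WITH batch AS (
        SELECT TOP (100000) id, DoorsReleased
        FROM MyTable
        WHERE LitraID = 8175
          AND id > @lastId
          AND id <= 300000000
        ORDER BY id
    )
    UPDATE batch
    SET DoorsReleased = ~DoorsReleased
    OUTPUT inserted.id INTO @updated (id)

    IF @@ROWCOUNT = 0 BREAK   -- nothing left in the range

    SELECT @lastId = MAX(id) FROM @updated
    DELETE FROM @updated
END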
You might also be able to use OFFSET, for example:
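A sketch only; OFFSET ... FETCH needs SQL Server 2012 or later, and because the paging is keyed on id (not on the flipped column), flipping DoorsReleased between passes does not move the page boundaries:

DECLARE @batchSize INT = 100000
DECLARE @offset INT = 0
DECLARE @rows INT = 1

WHILE @rows > 0
BEGIN
    -- page through the matching ids, then join back to update just that page
    ;WITH page AS (
        SELECT id
        FROM MyTable
        WHERE LitraID = 8175
          AND id BETWEEN 100000000 AND 300000000
        ORDER BY id
        OFFSET @offset ROWS FETCH NEXT @batchSize ROWS ONLY
    )
    UPDATE t
    SET DoorsReleased = ~DoorsReleased
    FROM MyTable AS t
    INNER JOIN page AS p ON p.id = t.id

    SET @rows = @@ROWCOUNT
    SET @offset = @offset + @batchSize
END

Note that each pass still has to skip past @offset rows, so the id-keyed sketch above scales better on very large ranges.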
Upvotes: 1
Reputation: 3002
One of your issues is that the select statement in your loop fetches all records for LitraID = 8175 and assigns row numbers, then filters in the update statement. This happens on every iteration.
One way round this would be to get all the ids for the update before entering the loop and store them in a temporary table. Then you can write a similar query to the one you have, but joining to this table of ids, as sketched below.
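A sketch of that approach (the #ids table and the index name are illustrative):

-- collect the target ids once, with a stable row number
SELECT id,
       ROW_NUMBER() OVER (ORDER BY id) AS Row
INTO #ids
FROM MyTable
WHERE LitraID = 8175
  AND id BETWEEN 100000000 AND 300000000

CREATE UNIQUE CLUSTERED INDEX IX_ids_Row ON #ids (Row)

DECLARE @batchSize INT = 10000
DECLARE @from INT = 1
DECLARE @maxRow INT
SELECT @maxRow = MAX(Row) FROM #ids

WHILE @from <= @maxRow
BEGIN
    BEGIN TRANSACTION

    -- the expensive row numbering was done once, outside the loop
    UPDATE t
    SET DoorsReleased = ~DoorsReleased
    FROM MyTable AS t
    INNER JOIN #ids AS i ON i.id = t.id
    WHERE i.Row BETWEEN @from AND @from + @batchSize - 1

    COMMIT TRANSACTION

    SET @from = @from + @batchSize
END

DROP TABLE #ids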
However, there is an even easier way if you know approximately how many records have LitraID = 8175 and if they are spread throughout the table, not bunched together with similar ids.
DECLARE @batchSize INT
DECLARE @minId INT
DECLARE @maxId INT

SET @batchSize = 10000 -- adjust according to how frequently LitraID = 8175 occurs; larger if infrequent
SET @minId = 100000000

WHILE @minId <= 300000000
BEGIN
    SET @maxId = @minId + @batchSize - 1
    IF @maxId > 300000000
    BEGIN
        SET @maxId = 300000000
    END

    BEGIN TRANSACTION T
    UPDATE MyTable
    SET DoorsReleased = ~DoorsReleased
    WHERE LitraID = 8175 -- keep the original filter so rows with other LitraIDs are untouched
      AND id BETWEEN @minId AND @maxId
    COMMIT TRANSACTION T

    SET @minId = @maxId + 1
END
This will use the value of id to control the loop, meaning you don't need the extra step to calculate @iterationCount. It uses small batches so that the table isn't locked for long periods. It doesn't have any unnecessary SELECT statements and the WHERE clause in the update is efficient assuming id has an index.
It won't have exactly the same number of records updated in every transaction, but there's no reason it needs to.
Upvotes: 1