Reputation: 12959
We are having a table with billion rows. We just want to UPDATE two columns to NULL for all rows. We were trying to do this in batches. But, we are concerned about the performance issues. We are having clustered columnstore index on this table. Does creating composite non-clustered index on these three columns would help ? or can we re-load the table to new table & swap the tables ?
Any inputs will be very helpful.
DECLARE @notNULLRecordsCount INT;
SET @notNULLRecordsCount = 1;
WHILE @notNULLRecordsCount > 0
BEGIN
BEGIN TRANSACTION;
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column1 = NULL,
Column2 = NULL,
Column3 = NULL
WHERE Column1 IS NOT NULL OR Column2 IS NOT NULL OR Column3 IS NOT NULL;
SET @notNULLRecordsCount = @@ROWCOUNT;
COMMIT TRANSACTION;
END
Upvotes: 1
Views: 2339
Reputation: 45096
My experience is that indexes actually slow this down as the indexes must be maintained.
Leaving a transaction open is not good. Wrapping all these in a transaction is not good. If you are going to wrap them all in a transaction then no purpose to breaking them up.
If you could key on just one column (no or
) it would be faster.
You can shorten this
select 1;
WHILE @@ROWCOUNT > 0
BEGIN
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column1 = NULL,
Column2 = NULL,
Column3 = NULL
WHERE Column1 IS NOT NULL
OR Column2 IS NOT NULL
OR Column3 IS NOT NULL
END
To get rid of the or
select 1;
WHILE @@ROWCOUNT > 0
BEGIN
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column1 = NULL,
Column2 = NULL,
Column3 = NULL
WHERE Column1 IS NOT NULL
END
select 1;
WHILE @@ROWCOUNT > 0
BEGIN
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column2 = NULL,
Column3 = NULL
WHERE Column2 IS NOT NULL
END
select 1;
WHILE @@ROWCOUNT > 0
BEGIN
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column3 = NULL
WHERE Column3 IS NOT NULL
END
Upvotes: 4
Reputation: 12959
Thanks for the suggestion by @Paparazzo, we have decided to do this as a two staged UPDATE, using single column. We do once using Column1 as NULL and another one using Column1 as NOT NULL, so that we cover all the rows.
select 1;
WHILE @@ROWCOUNT > 0
BEGIN
BEGIN TRANSACTION;
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column2 = NULL,
Column3 = NULL
WHERE Column1 IS NULL;
COMMIT TRANSACTION;
END
select 1;
WHILE @@ROWCOUNT > 0
BEGIN
BEGIN TRANSACTION;
UPDATE TOP (100000)
dbo.BillionRowsTable
SET Column1= NULL,
Column2 = NULL,
Column3 = NULL
WHERE Column1 IS NOT NULL;
COMMIT TRANSACTION;
END
Upvotes: 0