Peter
Peter

Reputation: 9

Performance issues with Replace Function in T-SQL

I have a large table that i am working on and for nearly all of the columns i need to use the replace statement to remove single and double quotes. The code looks like this:

SET QUOTED_IDENTIFIER ON

Update dbo.transactions set transaction_name1 = Replace(transaction_name1,'''','')
Update dbo.transactions set transaction_name2 = Replace(transaction_name2,'''','')
Update dbo.transactions set transaction_name3 = Replace(transaction_name3,'''','')
Update dbo.transactions set transaction_name4 = Replace(transaction_name4,'''','')
Update dbo.transactions set transaction_name5 = Replace(transaction_name5,'''','')

I have not put an index on the table as was not sure exactly what column would be any good being that i'm updating nearly all the columns. If i sorted the table asc by the primary key would that help increase performance?

Over then that the statements have been running for over 2 hours with no error messages and wondered if there is a solution to this performance issue, besides the usual more hardware changes? If someone could advise on ways of increasing performance of the script.

Cheers, Peter

Upvotes: 0

Views: 4236

Answers (3)

Michael Haren
Michael Haren

Reputation: 108336

You can make this a single UPDATE statement:

UPDATE transactions SET
  transaction_name1 = Replace(transaction_name1,'''',''),
  transaction_name2 = Replace(transaction_name2,'''','')
  ... (and so on)

That would likely improve the performance by something approaching a factor of 5.


Edit:

Since this is a one shot thing on a huge dataset (90MM rows), I suggest adding in a where clause and running it in batches.

If your transactions have a primary key, partition the updates on that, doing maybe 500k at once.

Do this in a loop with explicit transactions to keep your log use to a minimum:

DECLARE @BaseID INT, @BatchSize INT
SELECT @BaseID = MAX(YourKey), @BatchSize = 500000 FROM transactions

WHILE @BaseID > 0 BEGIN

  PRINT 'Updating from ' + CAST(@BaseID AS VARCHAR(20))

  -- perform update
  UPDATE transactions SET
    transaction_name1 = Replace(transaction_name1,'''',''),
    transaction_name2 = Replace(transaction_name2,'''','')
    -- ... (and so on)
  WHERE YourKey BETWEEN @BaseID - @BatchSize AND @BaseID

  SET @BaseID = @BaseID - @BatchSize - 1

END

Another note:

If the quotes must not appear in your data, you can create a check constraint to keep them out. It's a last ditch effort as any app attempting to put them in would need to handle a database exception, but it will keep your data clean. Something like this might do it:

ALTER TABLE transactions
  ADD CONSTRAINT CK_NoQuotes CHECK(
    CHARINDEX('''',transaction_name1)=0 AND
    CHARINDEX('''',transaction_name2)=0 AND
    -- and so on...
  )

Upvotes: 4

KM.
KM.

Reputation: 103637

You might try making this only a single UPDATE and only updating the rows that need it:

UPDATE dbo.transactions
SET transaction_name1 = REPLACE(transaction_name1,'''',''),
    transaction_name2 = REPLACE(transaction_name2,'''',''),
    transaction_name3 = REPLACE(transaction_name3,'''',''),
    transaction_name4 = REPLACE(transaction_name4,'''',''),
    transaction_name5 = REPLACE(transaction_name5,'''','')
WHERE 
    CHARINDEX('''',transaction_name1)>0
    OR CHARINDEX('''',transaction_name2)>0
    OR CHARINDEX('''',transaction_name3)>0
    OR CHARINDEX('''',transaction_name4)>0
    OR CHARINDEX('''',transaction_name5)>0

Upvotes: 1

Yvo
Yvo

Reputation: 19263

You can combine the statements, that might be a bit faster:

SET QUOTED_IDENTIFIER ON

UPDATE dbo.transactions
SET transaction_name1 = REPLACE(transaction_name1,'''',''),
    transaction_name2 = REPLACE(transaction_name2,'''',''),
    transaction_name3 = REPLACE(transaction_name3,'''',''),
    transaction_name4 = REPLACE(transaction_name4,'''',''),
    transaction_name5 = REPLACE(transaction_name5,'''','')

Also, check out the estimated execution plan. It might give you a useful advice in how to optimize your database / query (it's a little square button in the button bar of SQL Management Studio).

Upvotes: 2

Related Questions