user3380585

Reputation: 171

TSQL: Is REPLACE better than LTRIM/RTRIM?

Is REPLACE better than LTRIM/RTRIM? There are no spaces between the words (only leading/trailing spaces), because I am running it on a key column.

  UPDATE [db14].[dbo].[S_item_60M]
  SET [item_id] = LTRIM(RTRIM([item_id]));

item_id has a non-clustered index.

Shall I disable the index for better performance?

Windows 7, 24 GB RAM, SQL Server 2014.

This query ran for 20 hours and then I canceled it. I am thinking of running REPLACE instead of LTRIM/RTRIM for performance reasons.

SSMS crashed.

Now I can see it running in Activity Monitor.

The error log says: FlushCache: cleaned up 66725 bufs with 25872 writes in 249039 ms (avoided 11933 new dirty bufs) for db 7:0

Please advise.

Upvotes: 0

Views: 2125

Answers (3)

Razvan Socol

Reputation: 5694

If there are rows which already have no spaces, exclude them from the UPDATE with a WHERE clause such as CHARINDEX(' ', item_id) <> 0. But the most important advice (already posted by gvee) is to do the UPDATE in batches, if you have a key you can use for paging. Another approach (possibly better, if you have enough space) is to use an operation that can be minimally logged in the bulk-logged or simple recovery model: SELECT INTO another table, then rename that table.
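A sketch of the batched approach, using the table from the question (the batch size is illustrative, and this assumes the values contain no interior spaces, as stated in the question, so the CHARINDEX filter doubles as the paging condition: once a row is trimmed it no longer matches):

```sql
-- Sketch: trim in batches of 10,000 rows, touching only rows
-- that still contain a space. Batch size is an assumption.
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) [db14].[dbo].[S_item_60M]
    SET [item_id] = LTRIM(RTRIM([item_id]))
    WHERE CHARINDEX(' ', [item_id]) <> 0;  -- skip already-clean rows

    SET @rows = @@ROWCOUNT;  -- loop ends when no rows matched

    CHECKPOINT;  -- in SIMPLE recovery, lets log space be reused between batches
END
```

Each batch commits on its own, so the log does not have to hold 60M row versions at once, and a crash or cancel only loses the in-flight batch.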

Upvotes: 1

Dan Guzman

Reputation: 46233

I don't think REPLACE versus LTRIM/RTRIM is the long pole in the tent performance-wise. Do you have concurrent activity against the table during the update? I suggest you perform this operation during a maintenance window to avoid blocking other queries.

If a lot of rows will be updated (more than 10% or so), I suggest you drop (or disable) the non-clustered index on the item_id column, perform the update, and then create (or rebuild) the index afterward. Specify the TABLOCKX locking hint.
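That sequence might look like this; the index name is an assumption, since the question does not give it:

```sql
-- Assumes the non-clustered index is named IX_S_item_60M_item_id (illustrative).
ALTER INDEX IX_S_item_60M_item_id ON [db14].[dbo].[S_item_60M] DISABLE;

-- TABLOCKX takes an exclusive table lock up front, avoiding
-- per-row lock overhead and lock escalation during the update.
UPDATE [db14].[dbo].[S_item_60M] WITH (TABLOCKX)
SET [item_id] = LTRIM(RTRIM([item_id]));

-- Rebuilding re-enables a disabled index.
ALTER INDEX IX_S_item_60M_item_id ON [db14].[dbo].[S_item_60M] REBUILD;
```

Disabling rather than dropping keeps the index definition in place, so you don't need the CREATE INDEX statement to restore it.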

Upvotes: 1

usr

Reputation: 171206

The throughput of bulk updates does not depend on a single call per row to ltrim or rtrim. You arbitrarily pick some highly visible element of your query and consider it responsible for bad performance. Look at the query plan to see what's being done physically. Also, make yourself familiar with bulk update techniques (such as dropping and recreating indexes).

Note that, contrary to popular belief, a bulk update with all rows in one statement is usually the fastest option. This strategy can cause blocking and high log usage, but it usually has the best throughput because the optimizer can optimize all the DML you are executing in one plan. If splitting DML into chunks were almost always a good idea, SQL Server would just do it automatically as part of the plan.

Upvotes: 1
