Reputation: 171
Is REPLACE better than LTRIM/RTRIM? There are no spaces between the words, because I am running it on a key column.
update [db14].[dbo].[S_item_60M]
set [item_id]=ltrim(rtrim([item_id]))
The item_id column has a non-clustered index.
Should I disable the index for better performance?
Windows 7, 24GB RAM , SQL Server 2014
This query ran for 20 hours before I canceled it. I am thinking of running REPLACE instead of LTRIM/RTRIM for performance reasons.
SSMS crashed.
Now I can see it running in Activity Monitor
Error Log says FlushCache: cleaned up 66725 bufs with 25872 writes in 249039 ms (avoided 11933 new dirty bufs) for db 7:0
Please advise.
Upvotes: 0
Views: 2125
Reputation: 5694
If some rows already have no spaces, exclude them from the UPDATE with a WHERE clause such as CHARINDEX(' ', item_id) <> 0. But the most important advice (already posted above by gvee) is to do the UPDATE in batches (if you have a key which you can use for paging). Another approach (possibly better if you have enough space) would be to use an operation that can be minimally logged (in the bulk-logged or simple recovery model): use SELECT INTO to write the trimmed data to another table, and then rename that table.
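The batching idea above could look roughly like the following. This is only a sketch under assumptions: the batch size of 50000 is arbitrary, item_id is assumed to be a varchar column, and the filter checks for leading or trailing spaces with LIKE instead of the CHARINDEX test, so that a value with a space in the middle cannot be selected again on every pass.

```sql
-- Sketch only: batch size is an arbitrary assumption, tune it for your system.
-- Assumes [item_id] is varchar (for char(n) the padding comes back after RTRIM
-- and this loop would never finish).
DECLARE @batch INT = 50000;
DECLARE @rows  INT = 1;

WHILE @rows > 0
BEGIN
    -- Touch only rows that still have a leading or trailing space.
    UPDATE TOP (@batch) [db14].[dbo].[S_item_60M]
    SET [item_id] = LTRIM(RTRIM([item_id]))
    WHERE [item_id] LIKE ' %'
       OR [item_id] LIKE '% ';

    -- Stop once a pass finds nothing left to trim.
    SET @rows = @@ROWCOUNT;
END;
```

Each pass commits on its own in autocommit mode, which keeps individual transactions short, but every pass also has to find its rows again, so total run time may well be longer than a single statement.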
Upvotes: 1
Reputation: 46233
I don't think REPLACE versus LTRIM/RTRIM is the long pole in the tent performance-wise. Do you have concurrent activity against the table during the update? I suggest you perform this operation during a maintenance window to avoid blocking other queries.
If a lot of rows will be updated (more than 10% or so), I suggest you drop (or disable) the non-clustered index on the item_id column, perform the update, and then create (or enable) the index afterward. Specify the TABLOCKX locking hint.
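As a rough sketch of that sequence, assuming it runs in a maintenance window: the index name IX_S_item_60M_item_id below is hypothetical (the question does not give it), so substitute the real one.

```sql
-- Hypothetical index name; look up the real one in sys.indexes.
ALTER INDEX [IX_S_item_60M_item_id]
    ON [db14].[dbo].[S_item_60M] DISABLE;

-- TABLOCKX takes an exclusive table lock up front for the whole update.
UPDATE [db14].[dbo].[S_item_60M] WITH (TABLOCKX)
SET [item_id] = LTRIM(RTRIM([item_id]));

-- A disabled index is re-enabled by rebuilding it.
ALTER INDEX [IX_S_item_60M_item_id]
    ON [db14].[dbo].[S_item_60M] REBUILD;
```

Disabling is only safe like this for a non-clustered index; disabling a clustered index would make the table inaccessible until it is rebuilt.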
Upvotes: 1
Reputation: 171206
The throughput of bulk updates does not depend on a single call per row to ltrim or rtrim. You have arbitrarily picked a highly visible element of your query and held it responsible for the bad performance. Look at the query plan to see what is being done physically. Also, familiarize yourself with bulk update techniques (such as dropping and recreating indexes).
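One way to look at the plan without actually running the 60M-row update is to ask for the estimated plan. A minimal sketch, assuming SSMS or sqlcmd (GO is a batch separator for those tools, not T-SQL):

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO

-- With SHOWPLAN_XML on, this is compiled but not executed;
-- the estimated plan is returned as XML instead.
UPDATE [db14].[dbo].[S_item_60M]
SET [item_id] = LTRIM(RTRIM([item_id]));
GO

SET SHOWPLAN_XML OFF;
GO
```

In the returned plan, the operators that maintain the non-clustered index on item_id (and any sorts or spools feeding them) are the things to look for, rather than the scalar trim itself.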
Note that, contrary to popular belief, a bulk update with all rows in one statement is usually the fastest option. This strategy can cause blocking and high log usage, but it usually has the best throughput because the optimizer can optimize all the DML that you are executing in one plan. If splitting DML into chunks were almost always a good idea, SQL Server would just do it automatically as part of the plan.
Upvotes: 1