xingkong
xingkong

Reputation: 109

How to optimize SQL Server Merge statement running with millions of records

I use SQL Server 2014 and need to update a new added datetime type column in one table. There are two tables related (both have > 30 millions of records):

TableA:

CategoryID, itemID, dataCreated, deleted, some other string properties. 

This table contains multiples records for each item with different datecreated.

TableB:

CategoryID, itemID, LatestUpdatedDate (This is the new added column)

both categoryID and itemID are part of an index on this table.

To update tableB's LatestUpdatedDate from table A on matched CategoryID and ItemID, I used the following merge statement:

merge [dbo].[TableB] with(HOLDLOCK) as t
using 
(
    select CategoryID,itemID, max(DateCreated) as LatestUpdatedDate 
    from dbo.TableA 
    where TableA.Deleted = 0
    group by CategoryID,itemID
) as s on t.CategoryID = s.CategoryID and t.itemID = s.itemID

when matched then
    update
    set t.LatestUpdatedDate = s.LatestUpdatedDate

when not matched then
    insert (CategoryID, itemID, LatestUpdatedDate)
    values (s.CategoryID, s.itemID)

Given the fact that millions of records in both table, How can I optimize this script? Or Is there any other way to update the table with better performance?

Note: This is a one-off script and DB is on live, there would be a trigger added to tableA against insert to update the date in tableB in the future.

Upvotes: 5

Views: 38241

Answers (1)

Serge
Serge

Reputation: 4036

As per Optimizing MERGE Statement Performance, the best you can do is:

  • Create an index on the join columns in the source table that is unique and covering.
  • Create a unique clustered index on the join columns in the target table.

You may get a performance improvement during MERGE1 by creating an index on TableA on (Deleted, CategoryID, itemID) INCLUDE(DateCreated). However, since this is a one-off operation, the resources (time, CPU, space) required to create this index probably won't offset the performance gains vis-a-vis running the query as-is and relying on your existing index.

Upvotes: 13

Related Questions