Snowy
Snowy

Reputation: 6132

SQL Server Merge and Indexing Speed

I have a merge statement that needs to compare on many columns. The source table has 26,000 rows. The destination table has several million rows. The desintation table only has a typical Primary Key index on an int-type column.

I did some selects with group by to count the number of unique values in the source.

The test part of the Merge is

Merge Into desttable
Using #temptable
On 
(
desttable.ColumnA = #temptable.ColumnA  
and
desttable.ColumnB  =  #temptable.ColumnB  
and
desttable.ColumnC  = #temptable.ColumnC 
and
desttable.ColumnD  = #temptable.ColumnD 
and
desttable.ColumnE  = #temptable.ColumnE 
and
desttable.ColumnF  = #temptable.ColumnF
) 
When Not Matched Then  Insert Values (.......)

-- ColumnA: 167 unique values in #temptable
-- ColumnB: 1 unique values in #temptable
-- ColumnC: 13 unique values in #temptable
-- ColumnD: 89 unique values in #temptable
-- ColumnE: 550 unique values in #temptable
-- ColumnF: 487 unique values in #temptable

-- ColumnA: 3690 unique values in desttable
-- ColumnB: 3 unique values (plus null is possible) in desttable
-- ColumnC: 1113 unique values in desttable
-- ColumnD: 2662 unique values in desttable
-- ColumnE: 1770 unique values in desttable
-- ColumnF: 1480 unique values in desttable

The merge right now takes a very, very long time. I think I need to change my primary key but am not sure what the best tactic might be. 26,000 rows can be inserted on the first merge, but subsequent merges might only have ~2,000 inserts to do. Since I have no indexes and only a simple PK, everything is slow. :)

Can anyone point out how to make this better?

Thanks!

Upvotes: 0

Views: 2538

Answers (2)

Lester S
Lester S

Reputation: 730

My suggestion is if you only need to run it once, then Merge statement is acceptable if time is not that critical. But, if you're going to use the script more often, I think it'll be better if you do it step by step instead of using the Merge statement. Step by step, like creating your own select, insert, update, delete statements in order to attain the goal. With this you'll have more control almost on everything(query optimization, indexing, etc...)

In your case, probably separating the 6 where criteria might be more efficient than combining them all at once. Downside is you'll have longer script.

Upvotes: 1

marc_s
marc_s

Reputation: 755541

Well, an obvious candidate would be an index on the columns you use to do your matching in the MERGE statement - do you have an index on (ColumnA, ColumnB, ColumnC, ColumnD, ColumnE, ColumnF) on your destitation table??

This tuple of columns is being used to determine whether or not a row from your source table already exists in the database. If you don't have that index nor any other usable index in place, you get a table scan on the large destination table for each row in your source table, basically.

If not: I would try to add it and then see how the runtime behavior changes. Does the MERGE now run a little less than a very, very long time??

Upvotes: 2

Related Questions