Paul Alan Taylor
Paul Alan Taylor

Reputation: 10680

What is the most efficient way to update a table during a merge operation?

Like many developers, I perform a lot of merge operations with data, specifically SQL Server.

Historically, I have used the old trick of:-

1) Doing a left join on the existing data, and inserting anything I don't have a corresponding record for.

2) After 1), updating rows in my target table.

I have to take a performance hit on 1). It's unavoidable. However, on 2), I have been rather profligate. Instead of just updating stuff that needs updating, I've updated everything I've matched ( whether the underlying data has changed or not ).

Turns out that SQL Server isn't too smart about this sort of update. It performs no pre-check to determine that what you are about to update isn't the same thing as what you are using to update it. Hence, updates done along these lines result in a physical write and impact any indexes that reference the field.

So, from my POV, my choices are as follows:-

1) Carry on as normal, basking in the current profligacy of my routine (and refreshing indexes daily on large DBs)

2) Write more UPDATE statements that update a specific field if the field has changed.

e.g.

UPDATE
    p2 
SET
    [SpecificField] = p1.[SpecificField]
FROM
    @source p1,
    Dest p2
WHERE
    p2.ExternalKey = p1.ExternalKey
AND COALESCE(p1.[SpecificField],'') <> COALESCE(p2.[SpecificField],'')

3) Something infinitely better that the Stack Overflow community suggests.

I'd really like to go with 3). Are my options really limited to 1 or 2? Note. I have looked into MERGE INTO. Same problems, really.

Upvotes: 4

Views: 2380

Answers (1)

bhamby
bhamby

Reputation: 15469

With MERGE INTO, you have the option of adding an additional search clause to a WHEN (NOT) MATCHED clause. For example,

MERGE INTO table_to_upsert AS target
 USING source_table    AS source
    ON target.key1 = source.key1
   AND target.key2 = source.key2
   AND target.key3 = source.key3
WHEN MATCHED AND (target.value <> source.value) THEN
            UPDATE SET target.value     = source.value,
                       target.timestamp = source.timestamp
WHEN MATCHED AND (target.userid <> source.userid) THEN
            UPDATE SET target.userid    = source.userid,
                       target.timestamp = source.timestamp
WHEN NOT MATCHED THEN
            INSERT (key1, key2, key3, 
                    value, userid, timestamp)
            VALUES (source.key1, source.key2, source.key3,
                    source.value, source.userid, source.timestamp)

However, this doesn't really solve your problem if multiple columns are updated at a time, the MERGE would take the first WHEN (NOT) MATCHED that evaluates true (similar to a CASE statement).

Upvotes: 2

Related Questions