What is the most efficient way to update a table during a merge operation?

Question

Like many developers, I perform a lot of merge operations on data, specifically in SQL Server.

Historically, I have used the old trick of:-

1) Doing a left join on the existing data, and inserting anything I don't have a corresponding record for.

2) After 1), updating rows in my target table.

I have to take a performance hit on 1). It's unavoidable. However, on 2), I have been rather profligate. Instead of just updating the rows that need updating, I've updated everything I've matched (whether the underlying data has changed or not).
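For concreteness, the two-step routine described above might look roughly like the sketch below. The table and column names are borrowed from the example later in this question; the column types and sample rows are assumptions made purely for illustration.

```sql
-- Hypothetical schema: column types and sample data are assumptions.
CREATE TABLE Dest (
    ExternalKey     INT         NOT NULL PRIMARY KEY,
    [SpecificField] VARCHAR(50) NULL
);

DECLARE @source TABLE (
    ExternalKey     INT         NOT NULL PRIMARY KEY,
    [SpecificField] VARCHAR(50) NULL
);

INSERT INTO Dest (ExternalKey, [SpecificField])
VALUES (1, 'unchanged'), (2, 'old');

INSERT INTO @source (ExternalKey, [SpecificField])
VALUES (1, 'unchanged'), (2, 'new'), (3, 'added');

-- Step 1: left join on the existing data and insert any source row
-- that has no corresponding record in the target.
INSERT INTO Dest (ExternalKey, [SpecificField])
SELECT
    p1.ExternalKey,
    p1.[SpecificField]
FROM
    @source p1
    LEFT JOIN Dest p2
        ON p2.ExternalKey = p1.ExternalKey
WHERE
    p2.ExternalKey IS NULL;

-- Step 2: update every matched row, whether or not the value differs.
-- This also touches key 1 (identical value) and key 3 (just inserted).
UPDATE
    p2
SET
    [SpecificField] = p1.[SpecificField]
FROM
    Dest p2
    INNER JOIN @source p1
        ON p2.ExternalKey = p1.ExternalKey;
```

In this sketch only key 2 genuinely needs the update in step 2; the other two matched rows are rewritten with the values they already hold.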

It turns out that SQL Server isn't too smart about this sort of update. It performs no pre-check to see whether the new value is identical to the value already stored. Hence, updates done along these lines result in a physical write and impact any indexes that reference the field.

So, from my POV, my choices are as follows:-

1) Carry on as normal, basking in the current profligacy of my routine (and refreshing indexes daily on large DBs)

Pros: it's easy.
Cons: it's crap.

2) Write more UPDATE statements that update a specific field if the field has changed.

e.g.

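-- Update only the rows where SpecificField actually differs.
UPDATE
    p2
SET
    [SpecificField] = p1.[SpecificField]
FROM
    Dest p2
    INNER JOIN @source p1
        ON p2.ExternalKey = p1.ExternalKey
WHERE
    -- COALESCE keeps the comparison NULL-safe, but treats NULL and '' as equal.
    COALESCE(p1.[SpecificField], '') <> COALESCE(p2.[SpecificField], '')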

Pros: it's highly specific, only updating when an update is required.
Cons: a lot of different UPDATE statements for tables with many columns.

3) Something infinitely better that the Stack Overflow community suggests.

I'd really like to go with 3). Are my options really limited to 1) or 2)? Note: I have looked into MERGE INTO. Same problems, really.
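For reference, a MERGE version of the same routine might look like the sketch below. It reuses the names from the example above; the column types and sample rows are assumptions for illustration. The change check still has to be spelled out per column in the WHEN MATCHED clause, which is why it feels like the same problem.

```sql
-- Hypothetical schema: column types and sample data are assumptions.
CREATE TABLE Dest (
    ExternalKey     INT         NOT NULL PRIMARY KEY,
    [SpecificField] VARCHAR(50) NULL
);

DECLARE @source TABLE (
    ExternalKey     INT         NOT NULL PRIMARY KEY,
    [SpecificField] VARCHAR(50) NULL
);

INSERT INTO Dest (ExternalKey, [SpecificField])
VALUES (1, 'unchanged'), (2, 'old');

INSERT INTO @source (ExternalKey, [SpecificField])
VALUES (1, 'unchanged'), (2, 'new'), (3, 'added');

MERGE INTO Dest AS p2
USING @source AS p1
    ON p2.ExternalKey = p1.ExternalKey
-- Source rows with no match in the target are inserted.
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ExternalKey, [SpecificField])
    VALUES (p1.ExternalKey, p1.[SpecificField])
-- Matched rows are updated only when the value differs; without this
-- extra condition every matched row would be written.
WHEN MATCHED
    AND COALESCE(p1.[SpecificField], '') <> COALESCE(p2.[SpecificField], '') THEN
    UPDATE SET [SpecificField] = p1.[SpecificField];
```

With more columns, the AND condition grows into one comparison per column, so the statement count drops to one but the per-column bookkeeping remains.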
