Stefan Collier
Stefan Collier

Reputation: 4682

SQL Server - coalesce data on duplicate keys when MERGE-ing

I've stumbled across an annoying situation where my source query results have duplicate keys with differing data. Unfortunately I need to back-fill any NULLs. I tried with a MERGE but I get a key error.

The equivalent query in MySQL (that I cannot convert) is:

Please note that I have changed all the field and table names

INSERT INTO user_brief (name, high_score, colour)
SELECT
  u.name,
  h.high_score,
  p.colour,
FROM foo_table AS f
    LEFT JOIN users AS u       ON f.user_id = u.id
    LEFT JOIN high_scores AS h ON f.user_id = h.id
    LEFT JOIN preferences AS p ON f.user_id = p.id
ON DUPLICATE KEY
UPDATE
    name        = COALESCE(user_brief.name,             VALUES(name)),
    high_score  = COALESCE(user_brief.high_score,       VALUES(high_score)),
    colour      = COALESCE(user_brief.colour,           VALUES(colour));

SELECT Query Results

If we take just the SELECT you would get the following results:

name | high_score  | color
---------------------------
foo  | NULL        | brown
foo  | 40          | NULL
bar  | 29          | blue
...

Desired Results

name | high_score | color
---------------------------
foo  | 40         | brown
bar  | 29         | blue
...

As you can see it has flattened (not sure if that's the correct term) taking the first non-null value for each column of a name keyed record.


My attempted MERGE solution (but it gets key errors):

MERGE INTO user_brief AS target
USING (SELECT
      u.name,
      h.high_score,
      p.colour,
    FROM foo_table AS f
        LEFT JOIN users AS u       ON f.user_id = u.id
        LEFT JOIN high_scores AS h ON f.user_id = h.id
        LEFT JOIN preferences AS p ON f.user_id = p.id) AS source
    ON target.name = source.name
WHEN MATCHED THEN
    UPDATE SET 
        target.name       = COALESCE(source.name,       target.name),
        target.high_score = COALESCE(source.high_score, target.high_score),
        target.colour     = COALESCE(source.colour,     target.colour)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (name, high_score, colour)
    VALUES (source.name, source.high_score, source.colour);

Upvotes: 2

Views: 957

Answers (1)

Lukasz Szozda
Lukasz Szozda

Reputation: 175596

You could use GROUP BY to flatten source:

WITH source AS (
    SELECT
      u.name,
      high_score = MIN(h.high_score),
      colour = MIN(p.colour)
    FROM foo_table AS f
        LEFT JOIN users AS u       ON f.user_id = u.id
        LEFT JOIN high_scores AS h ON f.user_id = h.id
        LEFT JOIN preferences AS p ON f.user_id = p.id
    GROUP BY u.name
)
MERGE INTO user_brief AS target
USING source
  ON target.name = source.name
WHEN MATCHED THEN
    UPDATE SET 
        target.name       = COALESCE(source.name,       target.name),
        target.high_score = COALESCE(source.high_score, target.high_score),
        target.colour     = COALESCE(source.colour,     target.colour)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (name, high_score, colour)
    VALUES (source.name, source.high_score, source.colour);

Upvotes: 1

Related Questions