Heluvaworld
Heluvaworld

Reputation: 31

Merging datasets with NULL values

Here's the scenario I'm trying to solve. I'm doing this in snowflakeDB but would love to see the approaches in other databases too. I've two sources tables S1 and S2. I need to outer join these tables based on id and load into a target table T.

I'm using a SQL merge statement in order to load into the target T based on the key columns. Not necessarily record will exist in S2 but it can arrive in the future.


Source S1
==========
ColA  ColB  EMP_ID
-------------------
Hello Fred  100
Hello Will  101

Source S2
==============
Colc   EMP_ID
-------------
World  101
Test   101

Target T:
==========
ColA | ColB | EMP_ID | ColC
-------------------------
Hello| Fred | 100    | NULL
Hello| Will | 101    | World
Hello| Will | 101    | Test 

In future:

Source S2 will have the records for EMP_ID 100.

Source S2
===========

Colc   EMP_ID
-------------
World  101
Test   101
Unit   100
Sale   100

Target T should look like below

Target T:
==========
ColA | ColB | EMP_ID | ColC
-------------------------
Hello| Fred | 100    | Unit
Hello| Fred | 100    | Sale
Hello| Will | 101    | World
Hello| Will | 101    | Test

Instead of below

Target T:
==========
ColA | ColB | EMP_ID | ColC
-------------------------
Hello| Fred | 100    | NULL
Hello| Fred | 100    | Unit
Hello| Fred | 100    | Sale
Hello| Will | 101    | World
Hello| Will | 101    | Test

I could delete the records and re-insert. Are there any other options that I could try?

Sample Code below:

MERGE INTO T 
using (SELECT COLA, COLB, ID, COLC 
             from S1 left outer join S2
               on S1.EMP_ID = S2.EMP_ID) Source
ON (NVL(Source.EMP_ID,0) = NVL(Target.EMP_ID,0))
  WHEN MATCHED THEN UPDATE
  WHEN NOT MATCHED THEN INSERT

Upvotes: 0

Views: 1762

Answers (1)

Rachel McGuigan
Rachel McGuigan

Reputation: 541

I was trying to help reproduce, but am having a darn time with the Merge statement in Snowflake. I believe your syntax for Merge is not correct. This should work in Snowflake.

CREATE TABLE S1
(Cola varchar(10),Colb varchar(10), EMP_ID number);

CREATE TABLE S2
(Colc varchar(10), EMP_ID number);

INSERT INTO S1 values ('Hello', 'Fred',  100), 
('Hello', 'Will',  101);
INSERT INTO S2 VALUES ('World',  101),
('Test',   101);

SELECT * FROM S1; 
SELECT * FROM S2;

CREATE TABLE T (Cola varchar, Colb varchar, EMP_ID number, Colc varchar);

//ref https://docs.snowflake.net/manuals/sql-reference/sql/merge.html
//SELECT COLA, COLB, S1.EMP_ID, COLC 
//FROM S1 LEFT OUTER JOIN S2 ON S1.EMP_ID = S2.EMP_ID;

MERGE INTO T USING ((SELECT COLA, COLB, s1.EMP_ID, COLC 
         from S1 LEFT OUTER JOIN S2
            on S1.EMP_ID = S2.EMP_ID)) a ON (NVL(a.EMP_ID,0) = NVL(T.EMP_ID,0)) 

WHEN MATCHED THEN DELETE WHEN NOT MATCHED THEN INSERT(cola, colb, emp_id, colc) VALUES (a.cola, a.colb,a.emp_id, a.colc);

Upvotes: 1

Related Questions