John Thomas
John Thomas

Reputation: 1105

Unioning complete and incomplete dataframes in Snowflake and SQL

So I have a df that looks like this

DF3

ID    field1   field2  field3
001   banana        1       y
001    apple        1       y
004   orange       21       n
005   orange       32       y

Now this table is DF3, it is a future state of Df2 which looks like this:

DF2

ID    field1   field2
001   banana        1
001    apple        1
003    apple        1
004   orange       21
005   orange       32

And DF2 follows DF1....

DF1

ID    field1
001   banana
001    apple
002   banana
003    apple
004   orange
005   orange

Think of it like DF3 is the complete record. I want the incomplete records from DF1 and DF2 in a table along with DF3.

I want my Final result to look like this:

ID    field1   field2  field3
001   banana        1       y
001    apple        1       y
002   banana     NULL    NULL
003    apple        1    NULL
004   orange       21       n
005   orange       32       y

I assume this can be done with some combinations of UNION but i am struggling on how to do so in snowflake.

Upvotes: 1

Views: 57

Answers (2)

Adrian White
Adrian White

Reputation: 1804

Hmm i'd do full outer joins just in case the ID's aren't perfectly in sync.

with df3 as (
select '001' ID,    'banana' field1,  1 field2, 'y' field3
union all select '001' ID,     'apple' field1,  1 field2, 'y' field3
union all select '004' ID,    'orange' field1, 21 field2, 'n' field3
union all select '005' ID,    'orange' field1, 32 field2, 'y' field3)
,df2 as (
      select '001' ID,    'banana' field1,  1 field2      
union all select '001' ID,     'apple' field1,  1 field2      
union all select '003' ID,     'apple' field1,  1 field2      
union all select '004' ID,    'orange' field1, 21 field2      
union all select '005' ID,    'orange' field1, 32 field2 )     
, df1 as (select '001' ID,    'banana' field1
union all select '001' ID,     'apple' field1
union all select '002' ID,    'banana' field1
union all select '003' ID,     'apple' field1
union all select '004' ID,    'orange' field1
union all select '005' ID,    'orange' field1)
select 
coalesce(df1.id, df2.id, df3.id) ID,
coalesce(df1.field1, df2.field1, df3.field1) field1,
coalesce(df2.field2, df3.field2) field2,
df3.field3
from df1 full outer join df2 on df1.id = df2.id full outer join df3 on         
df1.id = df3.id 
group by 1,2,3,4    

enter image description here

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269953

This looks like left join:

select df1.id, df1.field1, df2.field2, df3.field3
from df1 left join
     df2
     on df1.id = df2.id and
        df1.field1 = df2.field2 left join
     df3
     on df2.id = df3.id and
        df2.field1 = df3.field1 and
        df2.field2 = df3.field2;

Upvotes: 1

Related Questions