x89
x89

Reputation: 3480

insert into table where not exists (dbt)

I want to insert new data into the STG.NEW_TABLE when the id does not already exist in that table.

Since we can't use insert statements in dbt, I am not sure how to convert this query for dbt:

insert into STG.NEW_TABLE
select FILE_NAME::string                  as file_name,
       id::string                         as id, 
       XMLGET(f.value,'value'):"$"::string  as val
from STG.OLD_TABLE ot,
lateral flatten(input => XMLGET( ot.XML_DATA, 'data' ):"$") f
where not exists (select id 
                  from stg.NEW_TABLE nt
                  where nt.id=ot.id);

I tried breaking it into two parts where I first flatten the xml into a TMP view and then do a union in another model:

(SELECT * FROM {{ref('TMP_VIEW')}} where id NOT IN
    (SELECT id 
     FROM STG.STG.NEW_TABLE))

UNION ALL

SELECT * FROM STG.NEW_TABLE

as suggested here: https://docs.getdbt.com/guides/migration/tools/migrating-from-stored-procedures/2-inserts

While it seems to work, it doesn't feel like the most efficient solution. I do not want to split the process into two parts. I am gussing something with CTEs might be better. Any recommendations?

Upvotes: 0

Views: 3314

Answers (1)

Simeon
Simeon

Reputation: 836

In dbt you can set the materialization of a table to be incremental in the schema.yml file or within the sql file itself with:

{{ config(materialized='incremental', dist='id') }} https://docs.getdbt.com/docs/build/materializations#incremental

This will only insert records that don't already exist within the table

edit based on comments

What you can do is union the new and old table, then rank the id to retrieve a unique one. This allows the model to be incremental, and by specifying which data comes from the old/new table you can ensure only new records are inserted.

{{ config(materialized='incremental', dist='id') }}

with cte as ( 
    select *, 0 oldtable from stg.NEW_TABLE
    
    union 
    
    select FILE_NAME::string                  as file_name,
           id::string                         as id, 
           XMLGET(f.value,'value'):"$"::string  as val,
           1 oldtable
    from STG.OLD_TABLE ot,
    lateral flatten(input => XMLGET( ot.XML_DATA, 'data' ):"$") f
    where not exists (select id 
                      from stg.NEW_TABLE nt
                      where nt.id=ot.id)
    
)

final as ( 
    select *, 
        row_number() OVER ( order by oldtable, id ) rn
    from cte 
)

select
    file_name,
    id,
    val
from final
where rn = 1
and oldtable = 1

Upvotes: 1

Related Questions