Reputation: 3480
I want to insert new data into the STG.NEW_TABLE
when the id
does not already exist in that table.
Since we can't use insert statements in dbt, I am not sure how to convert this query for dbt:
insert into STG.NEW_TABLE
select FILE_NAME::string as file_name,
id::string as id,
XMLGET(f.value,'value'):"$"::string as val
from STG.OLD_TABLE ot,
lateral flatten(input => XMLGET( ot.XML_DATA, 'data' ):"$") f
where not exists (select id
from stg.NEW_TABLE nt
where nt.id=ot.id);
I tried breaking it into two parts where I first flatten the xml into a TMP view and then do a union in another model:
(SELECT * FROM {{ref('TMP_VIEW')}} where id NOT IN
(SELECT id
FROM STG.STG.NEW_TABLE))
UNION ALL
SELECT * FROM STG.NEW_TABLE
as suggested here: https://docs.getdbt.com/guides/migration/tools/migrating-from-stored-procedures/2-inserts
While it seems to work, it doesn't feel like the most efficient solution. I do not want to split the process into two parts. I am gussing something with CTEs might be better. Any recommendations?
Upvotes: 0
Views: 3314
Reputation: 836
In dbt you can set the materialization of a table to be incremental in the schema.yml
file or within the sql file itself with:
{{ config(materialized='incremental', dist='id') }}
https://docs.getdbt.com/docs/build/materializations#incremental
This will only insert records that don't already exist within the table
What you can do is union the new and old table, then rank the id to retrieve a unique one. This allows the model to be incremental, and by specifying which data comes from the old/new table you can ensure only new records are inserted.
{{ config(materialized='incremental', dist='id') }}
with cte as (
select *, 0 oldtable from stg.NEW_TABLE
union
select FILE_NAME::string as file_name,
id::string as id,
XMLGET(f.value,'value'):"$"::string as val,
1 oldtable
from STG.OLD_TABLE ot,
lateral flatten(input => XMLGET( ot.XML_DATA, 'data' ):"$") f
where not exists (select id
from stg.NEW_TABLE nt
where nt.id=ot.id)
)
final as (
select *,
row_number() OVER ( order by oldtable, id ) rn
from cte
)
select
file_name,
id,
val
from final
where rn = 1
and oldtable = 1
Upvotes: 1