PacMan
PacMan

Reputation: 1358

incremental load with bigquery isn't working with dbt unique_key

I'm trying to build an incremental model using dbt on BigQuery and it seems like I'm missing some detail on my code or in the way how incremental models works.

I am building a unique hash of my five functional keys which are POSStoreCode, StoreSystemType, TargetStartDate, TargetEndDate and TargetSalesType

My incremental model looks like this.

{{ config(materialized='incremental', alias='Sales_Target_Oblique_v1', schema='bq_ed_harmonized_'+var('env'), unique_key='Checksum') }}


with salestarget as (

  select
    store_code as POSStoreCode,
    store_name as StoreName,
    'SID' as StoreSystemType,
    target_qty as TargetQuantity,
    target_start_date as TargetStartDate,
    target_end_date as TargetEndDate,
    concat('Women ', ProductCategory) as TargetSalesType,
    cast(NULL as float64) as TargetAmount,
    '{{ var("transaction_id") }}' as ProcessingId,
    CURRENT_DATETIME() as CreationDatetime,
    CURRENT_DATETIME() as LastUpdateDatetime
  from {{ ref('salestarget') }} ),

  harmonized_salestarget as (
  select
    * ,
    {{ dbt_utils.surrogate_key(['POSStoreCode', 'StoreName', 'StoreSystemType', 'TargetQuantity', 'TargetStartDate', 'TargetEndDate', 'TargetSalesType', 'TargetAmount']) }} as Checksum
  from salestarget 
)

select * from harmonized_salestarget
{% if is_incremental() %}
    where Checksum != {{ dbt_utils.surrogate_key(['POSStoreCode', 'StoreName', 'StoreSystemType', 'TargetQuantity', 'TargetStartDate', 'TargetEndDate', 'TargetSalesType', 'TargetAmount']) }}
{% endif %}

When I first execute the the model, it works well by creating the table Sales_Target_Oblique_v1 and loading the data there, in addition when I re execute it again it doesn't merge anything which is normal.

Now that I'm updating the table source {{ ref('salestarget') }} manually on some values for example setting a new value for the TargetQuantity and when I execute the model again, it is supposed to detect a change happened and merge it since the new calculated checksum will be different from the old one, but it doesn't work, it merges 0 rows at the end.

Is there any problem with my checksum ?

Upvotes: 0

Views: 1489

Answers (1)

joellabes
joellabes

Reputation: 138

To add data to a dbt incremental model, you need to return only the new rows you would like appended (or updated) to the existing table.

In your case, you are returning no new rows because you are comparing the checksum of the query to itself (because you're selecting from harmonized_salestarget). Effectively you're saying where 1 != 1.

Your checksum definition at the top of the file is correct - what you want to do is change the bottom part of your query to find records that have changed since the last run of the model. Do you have a modified date column you can filter on for example?

These docs will help get you on the right track: https://docs.getdbt.com/docs/building-a-dbt-project/building-models/configuring-incremental-models

Upvotes: 1

Related Questions