Russell

Reputation: 2076

Sharing tests between dbt models

I have a bunch of dbt models that share about 90% of their structure. The idea is that these models will be combined into a single unified downstream model during the dbt run. Currently, my tests for these models contain a lot of duplication. For example:

- name: model1
  columns:
    - name: colA
      tests:
        - accepted_values:
            values: ['a', 'b']
    - name: colB
      tests:
        - not_null

- name: model2
  columns:
    - name: colA
      tests:
        - accepted_values:
            values: ['a', 'b', 'c']
    - name: colB
      tests:
        - not_null

I'd like to reduce the duplication in the schema.yml file by reusing the test config with small variations.

What I have tried so far:

  1. Defining the tests as a var in dbt_project.yml and referencing it in schema.yml. This works, but it doesn't allow any variation between models.

  2. Defining a macro that returns a Python list containing the test config, and calling the macro like this:

    columns: "{{ common_tests() }}"

This doesn't work; I get the error "could not render {{ common_tests() }} 'common_tests' is undefined".

Interestingly, it is possible to render YAML with a macro inside individual tests in the YAML file, just not at the top level.

I feel there should be an easy(ish) solution here, I'm just not finding it. Thanks in advance.

Upvotes: 2

Views: 1364

Answers (1)

tconbeer

Reputation: 5815

If you don’t mind defining all these models in a single .yml file, you can use YAML anchors for this.

Josh Devlin has a nice write-up here:


version: 2

models:
  - name: model_one
    columns:
      - name: id
        tests: &unique_not_null
          - unique
          - not_null
      - name: col_a
      - name: col_b
  - name: model_two
    columns:
      - name: id
        tests: *unique_not_null
      - name: col_c
      - name: col_d

Josh’s example shows an anchor on the tests key for a single column, but you could also put an anchor on the columns key. That doesn’t work so well, though: the merge key (<<) only merges mappings, so if a single test on a single column differs, you have to repeat the whole list. YAML has no equivalent of the merge key for lists or list items, which is really what you need here.
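
Applied to the models in the question, a minimal sketch might look like the following. It assumes both models are defined in the same .yml file and that the test is spelled not_null (dbt's built-in test name); the anchor name col_b is made up for illustration. Only the colB entry, which is identical in both models, is shared. colA is written out in full each time because its accepted_values list differs.

```yaml
version: 2

models:
  - name: model1
    columns:
      - name: colA
        tests:
          - accepted_values:
              values: ['a', 'b']
      # Anchor the whole column entry (hypothetical anchor name).
      - &col_b
        name: colB
        tests:
          - not_null
  - name: model2
    columns:
      # colA differs from model1, so it has to be repeated in full.
      - name: colA
        tests:
          - accepted_values:
              values: ['a', 'b', 'c']
      # Reuse the identical colB definition via an alias.
      - *col_b
```

Aliases are resolved when the YAML is parsed, so dbt should see two ordinary column definitions. The saving is limited to the entries that match exactly.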

Upvotes: 6
