Bish
Bish

Reputation: 41

Is it possible to write a single dbt test which applies to all tables in the whole project?

To ensure that new data is constantly arriving in Snowflake without interruption, I want to add some tests to my dbt project. I want to use freshness_anomalies test in Elementary, which is a package supporting dbt.

I can't figure out a way to only set up this test once so that the test is done on all tables. What I can do so far is something like this: excerpt from sources.yml

But this way, I have to add these lines for the test for each and every table I am creating in dbt, and I have many...

Is it possible to write the test only once? Btw, every single table has the column "META_LOAD_DATETIME", so that shouldn't be an issue.

Upvotes: 4

Views: 2473

Answers (2)

paradocslover
paradocslover

Reputation: 3304

There's a way actually - but it's more of a workaround. This answer is to help people who want to write tests consisting of all models, not specific to this question.

When would you want a test that applies to all models? For eg: You want to test if the schema.yml of the project contains the right columns as present in the dbt pipeline models.

You can write a generic test which applies to all models. For OP's case, he will have to tweak the test maybe by looking at the source code of the freshness_anomalies test

{% test all_columns_defined_in_schema_present_in_models(model) %}
    {% if execute %}
        {% set models = [] %}
        -- Get the models from the DAG
        {% for node in graph.nodes.values() | selectattr("resource_type", "equalto", "model") %}
                {% do models.append(node) %}
        {% endfor %}

        -- Iterate on each model
        {% for model in models %}
  
            -- This has to be tweaked
            SELECT
            NULL AS placeholder
            FROM {{ model.database ~ "." ~ model.schema ~ "." ~ model.name }}
            WHERE 1=0
            {% for column_name in model.columns %}
                AND {{ column_name }} IS NULL
            {% endfor %}
            {% if not loop.last %}UNION ALL{% endif %}

        {% endfor %}
    {% endif %}
{% endtest %}

The source for this comes from the discussion going on here - https://github.com/dbt-labs/dbt-core/issues/1570#issuecomment-1500395582

Upvotes: 1

tconbeer
tconbeer

Reputation: 5815

No, unfortunately for you, tests are Properties, not Configs, so they can only be defined in "Property" files (i.e., schema.yml files). Source. This means you need to write out the test property on every model/seed/source/etc. that you want it to run on.

.yml files are easy to generate programmatically, so if you wanted to do this hundreds or thousands of times, I would write a script for that. You could also use something like dbt_pre_commit to enforce the presence of this test for new models.

Upvotes: 1

Related Questions