Llus
Llus

Reputation: 55

Specify target in dbt model based on the folder location

DBT newbie here!

I am building a multiaccount dbt project and I don't know how I can make the target depend on the folder where each model is located.

This is my desired folder configuration:

└───models
    ├───account1
    │   ├───DB1
    │   │   └───SCHEMA1
    │   ├───DB2
    │   │   ├───SCHEMA2
    │   │   └───SCHEMA3
    │   └───DB3
    │       └───SCHEMA4
    └───account2
        ├───DB4
        │   └───SCHEMA5
        └───DB5
            ├───SCHEMA6
            └───SCHEMA7

I have configured in profiles.yml a dev profile with two targets: account1 and account2. How can I get DBT to run models in account1 folder using account1 as target and models in account2 folder using account2 target?

I have accomplished it by running:

dbt run --models account1 --target account1
dbt run --models account2 --target account2

But is there a better way to do this? I would like to avoid having to run two consequent runs.

I've also seen it is recommended to split the repo into two, but I would really like to keep the mono-repo approach.

Upvotes: 3

Views: 15075

Answers (2)

CallumO
CallumO

Reputation: 301

Hey 👋 Bit of a late one, but I'm trying to do the exact same thing as you as I'm actually moving from a polyrepo to a monorepo.

It's frustrating that dbt is built off of being super dynamic, but you can't switch out targets in a single invocation.

I'm not sure how similar your two targets are or what database provider you're using. But depending on what properties you need to switch you might be able to do it using macros.

In my case, I just needed to switch out the role and warehouse values (Snowflake) so I used a pre-hook:-

models:
  my_project:
    +pre-hook: [ "USE ROLE {{ generate_role_name(this) }};",
                 "USE WAREHOUSE {{ generate_warehouse_name(this, 'XS') }};"]
    database_1:
      +database: 'DB_ONE'
    database_2:
      +database: 'DB_TWO'

Then I had macros which evaluated the database name and returned the appropriate role and warehouse:-

{% macro generate_role_name(this) %}


    {%- set default_role = target.role -%}

    {%- if execute %}
        {%- if target.name == "sandbox" -%}

             {{ default_role }}
             
        {%- else -%}
            
            {% set role_name = this.database ~ "_WORKER" %}

            {% do log("💁 Setting role to: " ~ role_name ~ " for model: " ~ this.name, info=True) %}
            
             {{ role_name }}

        {%- endif -%}

    {%- endif -%}

{% endmacro %}
{% macro generate_warehouse_name(this, wh_size) %}


    {%- set default_warehouse = target.warehouse -%}

    {%- if execute %}
        {%- if target.name == "sandbox" -%}

             {{ default_warehouse }}

        {%- else -%}

            {% set database_name_parts = this.database.split('_') %}
            {% set modified_database_name = database_name_parts[1:] | join('_') %}
            {% set warehouse_name = modified_database_name ~ "_WH_" ~ wh_size %}

            {% do log("⚙️  Setting warehouse to: " ~ warehouse_name ~ " for model: " ~ this.name, info=True) %}
            
             {{ warehouse_name }}

        {%- endif -%}

    {%- endif -%}

{% endmacro %}

Again, this is what worked for me as I only needed to switch out the role and warehouse (dbt can switch database and schema using the built in configs ).

I'm not sure how much of your target needs switching out between the two accounts/directories. But thought i'd share my solution in case there was overlap 😊

Upvotes: 0

Adam Kipnis
Adam Kipnis

Reputation: 11001

You can specify the models you want to run as a path (https://docs.getdbt.com/reference/node-selection/syntax#examples). The --target option is used to tell dbt which profile you want to use when running a job and has nothing to do with the models you want executed.

dbt run -s path:models/account1
dbt run -s path:models/account2

You can also configure this through a selector. Create a file called selectors.yml in your project root directory

selectors:
  - name: foo
    definition:
      union:
        - method: path
          value: "models/account1"
        - method: path
          value: "models/account2"

Then run it with

dbt run --selector foo

Note that this approach combines both into a single run. You can split it into multiple tasks if you want them to be completely independent.

Upvotes: 4

Related Questions