Brian Postow

Reputation: 12187

How do I run DBT models from a Python script or program?

I have a DBT project, and a Python script that will read data from the PostgreSQL database to produce output.

However, part of the Python script needs to trigger a DBT run. I haven't found a library that lets me start a DBT run from an external script, but I'm pretty sure one exists. How do I do this?

ETA: The correct answer may be to download the DBT CLI and then use python system calls to use that.... I was hoping for a library, but I'll take what I can get.

Upvotes: 10

Views: 18315

Answers (1)

tconbeer

Reputation: 5805

Update: v1.5 has arrived!

With v1.5 of dbt, we get a stable and officially supported Python API for invoking dbt operations; this API has functional parity with the CLI.

From the docs:

from dbt.cli.main import dbtRunner, dbtRunnerResult

# initialize
dbt = dbtRunner()

# create CLI args as a list of strings
cli_args = ["run", "--select", "tag:my_tag"]

# run the command
res: dbtRunnerResult = dbt.invoke(cli_args)

# inspect the results
for r in res.result:
    print(f"{r.node.name}: {r.status}")

There are some caveats about the stability of artifacts returned by dbt.invoke; read the docs for more details.
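If the calling script needs to stop when dbt fails, the result object can be checked before the script goes on to read from the database. Below is a minimal sketch, not something from the docs: `build_run_args`, `invoke_dbt`, and `run_models` are hypothetical helper names, and it assumes dbt-core 1.5 or later, where the result exposes `success`, `exception`, and `result`.

```python
from typing import Any, Callable, List, Optional, Sequence


def build_run_args(select: Optional[str] = None,
                   project_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for `dbt run` (hypothetical helper)."""
    args = ["run"]
    if select:
        args += ["--select", select]
    if project_dir:
        args += ["--project-dir", project_dir]
    return args


def invoke_dbt(cli_args: Sequence[str]) -> Any:
    """Call dbt in-process; assumes dbt-core >= 1.5 is installed."""
    # Imported lazily so the rest of the script loads without dbt installed.
    from dbt.cli.main import dbtRunner
    return dbtRunner().invoke(list(cli_args))


def run_models(select: Optional[str] = None,
               project_dir: Optional[str] = None,
               invoke: Callable[[Sequence[str]], Any] = invoke_dbt) -> Any:
    """Run dbt and raise if the invocation did not succeed."""
    res = invoke(build_run_args(select, project_dir))
    if res.exception is not None:
        # dbt hit an unhandled error and could not complete the command.
        raise RuntimeError("dbt could not complete the run") from res.exception
    if not res.success:
        # dbt completed, but reported a handled failure (e.g. a model error).
        raise RuntimeError("dbt finished, but reported a failure")
    return res.result
```

A call such as `run_models(select="tag:my_tag")` would then sit just before the code that queries PostgreSQL. The `invoke` parameter is only there so the wrapper can be exercised with a stand-in when dbt is not installed.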

Original Answer

(As of Jan 2023) There is not a public Python API for dbt, yet. It is expected in v1.5, which should be out in a couple months.

Right now, your safest option is to use the CLI. If you don't want to use subprocess, the CLI uses Click now, and Click provides a runner that you can use to invoke Click commands. It's usually used for testing, but I think it would work for your use case, too. The CLI command is here. That would look something like:

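from click.testing import CliRunner
from dbt.cli.main import run  # the Click command behind `dbt run`

# CliRunner invokes a Click command in-process, without spawning a subprocess
dbt_runner = CliRunner()
# a string passed as args is split shell-style into individual arguments
result = dbt_runner.invoke(run, args="-s my_model")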

You could also invoke dbt the way they do in the test suite, using run_dbt.
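For comparison, the subprocess route mentioned above could look roughly like the sketch below. It assumes the `dbt` executable is on the PATH of the environment running the script; `run_dbt_cli` is a hypothetical helper name, and the `executable` parameter is only there so a different binary can be substituted.

```python
import subprocess
from typing import Optional, Sequence


def run_dbt_cli(args: Sequence[str],
                executable: str = "dbt",
                cwd: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the dbt CLI as a child process and capture its output."""
    cmd = [executable, *args]
    # check=False: the caller inspects returncode rather than catching an exception.
    return subprocess.run(cmd, cwd=cwd, capture_output=True, text=True, check=False)


# Example call (assumes dbt is installed and cwd points at the dbt project):
# proc = run_dbt_cli(["run", "--select", "my_model"], cwd="path/to/project")
# if proc.returncode != 0:
#     raise RuntimeError(proc.stdout + proc.stderr)
```

Passing the arguments as a list avoids shell quoting issues, and a nonzero `returncode` signals that the run did not succeed.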

Upvotes: 16
