rronan

Reputation: 93

Simple way to execute tasks defined as a DAG, in python?

I am running a series of tasks that depend on one another in a complex way. I would like to describe these dependencies as a DAG (directed acyclic graph) and execute the graph when needed.

I have been looking at Airflow and wrote a dummy script:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def cloud_runner():
    # my typical usage here would be a http call to a service (e.g. gcp cloudrun)
    pass


with DAG(dag_id="my_id", schedule_interval=None, start_date=datetime.max) as dag:
    first_task = PythonOperator(task_id="1", python_callable=cloud_runner)
    second_task = PythonOperator(task_id="2", python_callable=cloud_runner)
    second_task_bis = PythonOperator(task_id="2bis", python_callable=cloud_runner)
    third_task = PythonOperator(task_id="3", python_callable=cloud_runner)

    first_task >> [second_task, second_task_bis] >> third_task

Running the following command does the job:

airflow dags backfill my_id --start-date 2020-01-02

PROBLEM:

My usage will never involve any scheduling / start-date / end-date of any kind. Moreover, my DAG will be triggered from a Python Flask server.

QUESTION:

Is there a way to achieve the same result without Airflow? Or to use Airflow in a trigger-only mode (without the scheduling part, airflow.db, etc.) from a standalone Python script?

Thanks

Upvotes: 0

Views: 821

Answers (1)

Elad Kalif

Reputation: 15961

Airflow is both a library and an application. DAGs don't have to run on a schedule; you can trigger them on demand with the API or CLI. However, you cannot run a DAG (scheduled or manually triggered) unless the Airflow application is running: Airflow needs the scheduler and the metadata database.
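For example, with the Airflow services running, a manual run of your DAG can be started from the CLI (a minimal sketch; my_id is the dag_id from your script, and the unpause step is only needed if the DAG is paused by default):

airflow dags unpause my_id
airflow dags trigger my_id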

To answer your question: no. You must set up and run Airflow for your DAGs to run.
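Since you mention the DAG will be triggered from a Flask server: with a running Airflow 2.x deployment and the stable REST API enabled, your Flask app can create a DAG run by calling the API. A minimal sketch, assuming the webserver is reachable on localhost:8080 and basic auth is configured; the host, credentials and dag_id below are placeholders:

import requests


def trigger_dag(dag_id: str) -> dict:
    # Create a new DAG run via Airflow's stable REST API (Airflow 2.x).
    # Host, port and credentials are placeholders for your own deployment.
    resp = requests.post(
        f"http://localhost:8080/api/v1/dags/{dag_id}/dagRuns",
        json={"conf": {}},
        auth=("admin", "admin"),
    )
    resp.raise_for_status()
    return resp.json()


# e.g. inside a Flask route:
# @app.route("/run")
# def run():
#     return trigger_dag("my_id")

This only moves the triggering into your Flask code; the scheduler and metadata database still need to be running for the tasks to actually execute.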

Upvotes: 1
