jd2050

Reputation: 302

Apache Airflow DAG with single task

I'm a newbie in Apache Airflow. There are a lot of examples of basic DAGs on the Internet. Unfortunately, I didn't find any examples of single-task DAGs.

Most DAG examples contain a bitshift operator at the end of the .py script, which defines the task order. For example:

# ...our DAG's code...
task1 >> task2 >> task3

But what if my DAG has just a single task at the moment? My question is: do I need to reference this single task's name at the end of the Python file? Or, if there is only one task in scope, will Airflow handle it by itself, making the last line of the code below redundant?

from datetime import timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'email': ['[email protected]'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
}
with DAG(
    'tutorial',
    default_args=default_args,
    description='A simple tutorial DAG',
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    tags=['example'],
) as dag:

    t1 = BashOperator(
        task_id='print_date',
        bash_command='date',
    )

    t1 # IS THIS LINE OF CODE NECESSARY?

Upvotes: 9

Views: 4118

Answers (1)

NicoE

Reputation: 4853

The answer is NO, you don't need to include the last line. You could also avoid the assignment of the variable t1 altogether, leaving the DAG like this:

with DAG(
    'tutorial',
    default_args=default_args,
    description='A simple tutorial DAG',
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    tags=['example'],
) as dag:

    BashOperator(
        task_id='print_date',
        bash_command='date',
    )

The reason to perform the assignment of an Operator instance (such as BashOperator) to a variable (called a Task in this scope) is similar to any other object in OOP. In your example there is no other "operation" performed on the t1 variable (you are not reading it or consuming any method from it), so there is no reason to declare it.

When starting with Airflow, I think it is very helpful to use the DebugExecutor to perform quick tests like this and understand how everything works. If you are using VS Code you can find an example config file here.

Upvotes: 11
