Reputation: 3490
dag1:
start >> clean >> end
I have a dag where i run a few tasks. But I want to modify it such that the clean
steps only runs if another dag "dag2" is not running at the moment.
Is there any way I can import information regarding my "dag2", check its status and if it is in success mode, I can proceed to the clean
step
Something like this:
start >> wait_for_dag2 >> clean >> end
How can I achieve the wait_for_dag2
part?
Upvotes: 0
Views: 2450
Reputation: 3583
Hossein's Approach is the way people usually go. However if you want to get info about any dag run data, you can use the airlfow functionality to get that info. The following appraoch is good when you do not want(or are not allowed) to modify another dag:
from airflow.models.dagrun import DagRun
from airflow.utils.state import DagRunState
dag_runs = DagRun.find(dag_id='the_dag_id_you_want_to_check')
last_run = dag_runs[-1]
if last_run.state == DagRunState.SUCCESS:
print('the dag run was successfull!')
else:
print('the dag state is -->: ', last_run.state)
Upvotes: 1
Reputation: 5110
There are some different answers depends on what you want to do:
if you have two dags with the same schedule interval, and you want to make the run of second dag waits the same run of first one, you can use ExternalTaskSensor on the last task of the first dag
if you want to run a dag2, after each run of a dag1 even if it's triggered manually, in this case you need to update dag1 and add a TriggerDagRunOperator and set schedule interval of the second to None
I want to modify it such that the clean steps only runs if another dag "dag2" is not running at the moment.
if you have two dags and you don't want to run them in same time to avoid a conflict on an external server/service, you can use one of the first two propositions or just use higher priority for the task of the first dag, and use the same pool (with 1 slot) for the tasks which lead to the conflict, but you will lose the parallelism on these tasks.
Upvotes: 2