Simplefish

Reputation: 1130

What happens if an airflow DAG is changed?

Airflow monitors the DAG location for new DAGs and picks them up (every minute or so) without needing a restart.

What happens if an updated dag definition is uploaded to the dag location?

Suppose I have a dag named "foodag" which generates 1 file and is run hourly on the hour. At exactly 0100 hrs I deploy a new version of "foodag" which now generates 2 files. There is a run currently starting at 0100 and another one at 0200.

How many files will the run at 0100 generate? Are there any race conditions here? What about the one at 0200?

Upvotes: 1

Views: 3367

Answers (2)

Ultra

Reputation: 83

Haowen Chan, your question is too terse, and the premise is incorrect. I would strongly suggest reading the Airflow beginner tutorial and best practices; Udemy has great courses on this too. This is fundamental to understand before development.

"Suppose I have a dag named 'foodag'": it is not clear whether foodag is the filename, the dag_id, or a combination of both. Both the dag_id and the file might need to be versioned (along with the start and end date of the DAG), depending on the use case:

  1. Is the change a bug fix where past data needs to be fixed?
  2. Is this a new f(x) that only applies from here on out?
  3. Do past results need to be deterministic if they need to be run again?
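
To make the versioning idea concrete, here is a minimal plain-Python sketch (no Airflow imports). `DagVersion`, `VERSIONS`, and `version_for` are hypothetical names, and the dates and the 02:00 cutover are made up for illustration; this is not an Airflow API. It only shows how giving each version its own dag_id plus an explicit start and end date makes it unambiguous which version a given run belongs to.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DagVersion:
    """One deployed version of a DAG (hypothetical helper, not an Airflow class)."""
    dag_id: str
    start_date: datetime
    end_date: Optional[datetime]  # None means the version is still active
    files_generated: int


# Assumed versions of the question's "foodag"; the 02:00 cutover is arbitrary.
VERSIONS = [
    DagVersion("foodag_v1", datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 2), 1),
    DagVersion("foodag_v2", datetime(2020, 1, 1, 2), None, 2),
]


def version_for(run_time: datetime) -> DagVersion:
    """Return the version whose [start_date, end_date) window contains run_time."""
    for v in VERSIONS:
        if v.start_date <= run_time and (v.end_date is None or run_time < v.end_date):
            return v
    raise LookupError(f"no DAG version covers {run_time}")


run_0100 = version_for(datetime(2020, 1, 1, 1))
run_0200 = version_for(datetime(2020, 1, 1, 2))
print(run_0100.dag_id, run_0100.files_generated)
print(run_0200.dag_id, run_0200.files_generated)
```

With the windows pinned down like this, a rerun of an old interval resolves to the same version every time, which is the point of question 3.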

See also: Efficient way to deploy dag files on airflow

Upvotes: 1

Zack

Reputation: 2466

If you deploy a new version of a DAG while the DAG is currently running, the currently running DAG (0100) will run the old version (generating 1 file). The next run (0200) will have the latest version (generating 2 files).

Upvotes: 8
