Reputation: 5661
I have a DAG that has been running for a while. Now I have more old data available and want to backfill.
I changed my parameters from:
from datetime import datetime, timedelta
from airflow import DAG

default_args = {
    'owner': 'drum',
    'depends_on_past': False,
    'start_date': datetime(2019, 7, 1),
    'retries': 2,
    'retry_delay': timedelta(minutes=5)
}

dag = DAG(
    dag_id='dag_one',
    catchup=False,
    default_args=default_args,
    schedule_interval='@weekly',
    max_active_runs=1
)
To:
default_args = {
    'owner': 'drum',
    'depends_on_past': False,
    'start_date': datetime(2018, 1, 1),  ### Update
    'retries': 2,
    'retry_delay': timedelta(minutes=5)
}

dag = DAG(
    dag_id='dag_one',
    catchup=True,  ### Update
    default_args=default_args,
    schedule_interval='@weekly',
    max_active_runs=1
)
However, this does not trigger the backfill. I am using the GUI exclusively, as I do not have access to a terminal.
Upvotes: 0
Views: 337
Reputation: 799
As I remember, you also need to update your dag_id (e.g. to dag_one_v2) when changing start_date. But be careful: updating the dag_id will lead to losing all of the DAG's metadata, so Airflow will also re-execute every run since 2019-07-01. You may therefore want to add some kind of check for whether your data has already been processed.
Upvotes: 2