Reputation: 764
Is it feasible to have a schedule pattern that runs a DAG three times a month: on the 5th day of the month, 14 days before the month end, and on the last day of the month?
Thanks
Upvotes: 1
Views: 494
Reputation: 534
Another option is to specify the date in the Cron expression.
Referring to this answer, we can use L for the end of the month. However, cron cannot express "14 days before the end of the month". I'm not sure whether it is acceptable to use the 14th to represent the last two-week period.
The proposed schedule_interval would be "0 0 5,14,L * *". Because Airflow triggers a run at the end of each schedule period, the execution_date will show the previous scheduled date (for the run on the 5th, that is the last day of the previous month). You also need to make sure that start_date is more than a month in the past for runs to start triggering.
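To solve the 14th-day problem with Python, compute the month's last day from a real date. Note that schedule_interval is not a templated field, so Jinja expressions such as "{{ execution_date.year }}" are not rendered there; the value below is fixed when the DAG file is parsed:
import calendar
from datetime import date
today = date.today()
last_day = calendar.monthrange(today.year, today.month)[1]
schedule_interval = f"0 0 5,{last_day - 14},L * *"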
Or use Pendulum's days_in_month (the same attribute that execution_date exposes) instead:
import pendulum
last_day = pendulum.now().days_in_month
schedule_interval = f"0 0 5,{last_day - 14},{last_day} * *"
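If you pass a Jinja-templated value such as "{{ execution_date.days_in_month }}" to an operator field, don't forget to add render_template_as_native_obj=True to the DAG (Airflow 2.1 or newer) so that it renders as a native Python object instead of a string. If that does not work, you will need to convert the value to int yourself.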
Reference: cron format in Crontab.guru (it doesn't support the L character, so I use 31 as a placeholder).
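To illustrate why a single fixed day-of-month cannot stand in for "14 days before the month end", here is a minimal sketch using only the standard library. The helper name offset_day is hypothetical and not part of Airflow; it just shows how the target day shifts with the length of the month.

```python
import calendar


def offset_day(year: int, month: int, days_before_end: int = 14) -> int:
    """Return the day of month that is `days_before_end` days before the last day."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return last_day - days_before_end


# The target day differs between 28-, 29-, 30- and 31-day months.
for year, month in [(2023, 2), (2024, 2), (2024, 4), (2024, 1)]:
    print(year, month, offset_day(year, month))
```

Since the result changes from month to month, any value baked into a cron string is only right for months of one particular length.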
Upvotes: 1
Reputation: 484
I think the best solution here is to run your DAG every day and then check whether the execution_date matches your conditions.
A crontab expression can only count days forward from the start of the month. The 5th day of the month is '0 0 5 * *', but counting backward from the month end is not possible.
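The daily check could look roughly like the sketch below. It assumes "14 days before the month end" means the last day minus 14, and the function name is_run_day is hypothetical. In an Airflow DAG scheduled daily, a predicate like this might be called from something like a ShortCircuitOperator with the run's date, but that wiring is left out here.

```python
import calendar
from datetime import date


def is_run_day(day: date, days_before_end: int = 14) -> bool:
    """True on the 5th, on the last day, and `days_before_end` days before the last day."""
    # Number of days in the month that `day` belongs to
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day in {5, last_day - days_before_end, last_day}


# February 2024 has 29 days, so the matching days are the 5th, 15th and 29th.
print([d for d in range(1, 30) if is_run_day(date(2024, 2, d))])
```

Keep in mind that with a daily schedule the execution_date is the start of the period, so the date passed in should be whichever one you actually want the condition applied to.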
Upvotes: 1