Reputation: 16716
While writing an Airflow DAG, I noticed that prev_execution_date_success returns None when the job is fresh and has never run before. Since this was breaking my SQL query in a major way, I decided to provide a custom handler via user_defined_macros. This is how it looks:
def __get_last_execution_time(execution_date: str) -> str:
    return (datetime.now() - timedelta(hours=1)).isoformat() if execution_date is None else execution_date
And this is how it is being invoked:
WHERE created_at >= TIMESTAMP('{{ get_last_execution_time(prev_execution_date_success) }}')
Very simple. However, it always returns None, even when prev_execution_date_success (and therefore execution_date) is None. This doesn't make any sense to me. I'm not a Python expert, so my question is: can None be some other None? Or what could hypothetically be happening in the context of the Airflow DAG that would break a None check?
UPDATE:
__get_last_execution_time is definitely executed. I have some logging in it, like this:
logging.info("prev_execution_date_success: %s", execution_date)
logging.info("test 1: %s", execution_date is None)
logging.info("test 2: %s", execution_date == 'None')
logging.info("Type of execution_date: %s", type(execution_date))
And the output is:
prev_execution_date_success: None
test 1: False
test 2: False
Type of execution_date: <class 'Proxy'>
Upvotes: 1
Views: 867
Reputation: 9308
This is a known issue in certain versions of Airflow (e.g. 2.2.2). There are two ways to deal with it:
Update to the latest version (2.3.2). Airflow fixed this issue so that the raw value is returned (ref: github.com/apache/airflow/issues/19716).
Or access __wrapped__ to obtain the raw value from the lazy-object-proxy instance:
print(execution_date is None) # False, execution_date is a Proxy instance
print(execution_date.__wrapped__ is None) # True, the raw value is None
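Putting the second option into the macro from the question, a minimal sketch could look like the following. A few things here are my assumptions rather than anything from Airflow itself: FakeProxy is a hypothetical stand-in used only so the snippet runs without Airflow installed, fallback_hours is a made-up parameter, and the getattr fallback is there so the same function also works on versions that already pass the raw value.

```python
from datetime import datetime, timedelta


class FakeProxy:
    """Hypothetical stand-in for the lazy proxy object passed into the template."""

    def __init__(self, wrapped):
        self.__wrapped__ = wrapped

    def __str__(self):
        # Like the real proxy, it prints as the wrapped value ("None").
        return str(self.__wrapped__)


def get_last_execution_time(execution_date, fallback_hours=1):
    # Unwrap the proxy if there is one; plain values pass through unchanged.
    raw = getattr(execution_date, "__wrapped__", execution_date)
    if raw is None:
        # No successful run yet: fall back to "one hour ago", as in the question.
        return (datetime.now() - timedelta(hours=fallback_hours)).isoformat()
    return str(raw)


proxied_none = FakeProxy(None)
print(proxied_none is None)                   # the proxy itself is not None
print(get_last_execution_time(proxied_none))  # falls back to an ISO timestamp
```

The function would still be registered through user_defined_macros as in the question; only the None check changes, because it now looks at the wrapped value instead of the proxy object.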
Upvotes: 3