Reputation: 34743
I have an Airflow Variable stored like this:
x | {"a": 1, "b": 2}
I want to retrieve this full JSON as a dict in my DAG script. The following works:
from airflow.models import Variable
op = CustomOperator(
templated_field = Variable.get('x', deserialize_json = True)
)
# called in CustomOperator:
templated_field.keys()
# dict_keys(['a', 'b'])
The documentation suggests the following should be equivalent:
The second call [using Variable.get(deserialize_json=True)] assumes json content and will be deserialized... or if you need to deserialize a json object from the variable [using a Jinja template]:
{{ var.json.<variable_name> }}
However, when I use this and then try to extract the keys, I get an error:
op = CustomOperator(
templated_field = '{{ var.json.x }}'
)
# called in CustomOperator
templated_field.keys()
AttributeError: 'str' object has no attribute 'keys'
The error suggests the JSON was not actually deserialized; calling .keys() on the string form fails the same way:
# same error
"{'a': 1, 'b': 2}".keys()
The only examples of the var.json approach I've found online don't extract a dict from JSON, but rather use a JSON path to extract a scalar:
# https://www.applydatascience.com/airflow/airflow-variables/
{{ var.json.example_variables_config.var3 }}
# https://gist.github.com/kaxil/61f41dd87a69230d1a637dc3a1d2fa2c
{{ var.json.dag1_config.var1 }}
Hossein helpfully points out that this field should be templated; here's the relevant line from CustomOperator:
class CustomOperator(SuperCustomOperator):
template_fields = SuperCustomOperator.template_fields + ('template_field',)
Am I missing something? Is the Jinja var.json approach only suitable for extracting scalars, or can it be used to extract JSON-as-dict as well?
Upvotes: 2
Views: 7086
Reputation: 11
I found this answer which helped me with a similar issue of passing a dictionary as variable.
The problem is that Jinja2's {{ var.json.x }} echoes the value of var.json.x, so it converts the dict to a str. As a side effect, the double quotes become single quotes in the JSON-like str, so you can't load the rendered string directly as JSON.
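The quoting change can be reproduced without Airflow. The sketch below assumes that rendering the dict into a template gives the same result as calling str() on it, which is what the AttributeError in the question implies; the variable names are illustrative only.

```python
import json

# Stand-in for the deserialized Variable value (illustrative, not read from Airflow).
value = {"a": 1, "b": 2}

# Assumption: the rendered template is equivalent to str(value),
# i.e. a Python repr with single-quoted keys.
rendered = str(value)
print(rendered)

try:
    json.loads(rendered)
    parsed_ok = True
except json.JSONDecodeError:
    # Single-quoted keys are not valid JSON, so parsing fails.
    parsed_ok = False

print(parsed_ok)
```

The string still looks like a dict, but it is no longer valid JSON, which is why it cannot simply be passed to json.loads.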
My solution was to add a custom filter that renders the dictionary as a JSON string (double-quoted where needed), which I can then convert back to a dictionary inside my function.
The code:
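import json

dag = DAG(
    ...
    user_defined_filters={'tojson': lambda s: json.dumps(s)},
)

def func(input_dict):
    # The rendered template arrives as a JSON string; parse it back into a dict.
    input_dict = json.loads(input_dict)
    ...

op = PythonOperator(
    ...
    # The template must be a quoted string so Jinja can render it.
    # Also applicable to other templated fields.
    op_kwargs={"input_dict": "{{ var.json.x | tojson }}"},
    python_callable=func,
)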
It works as expected, and now I can use input_dict as a dictionary.
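The round trip itself can be checked in plain Python. This is only a sketch: to_json stands in for the custom filter and func for the callable, and neither involves Airflow or Jinja.

```python
import json

def to_json(value):
    # Stand-in for the custom filter: serialize the dict to a JSON string.
    return json.dumps(value)

def func(input_dict):
    # Hypothetical callable: parse the rendered string back into a dict.
    return json.loads(input_dict)

original = {"a": 1, "b": 2}
rendered = to_json(original)  # double-quoted, valid JSON
restored = func(rendered)

print(rendered)
print(list(restored.keys()))
```

Because the filter emits valid JSON rather than a Python repr, json.loads recovers a dict equal to the original.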
Upvotes: 1
Reputation: 733
Actually, you should use it on templated fields! Templated fields are the fields that are rendered by Jinja. For example, consider this BashOperator:
Task = BashOperator(
task_id='bash_task',
bash_command='echo {{ var.json.x }}',
dag=dag,
)
Upvotes: 0