MichaelChirico

Reputation: 34743

Is Variable.get('x', deserialize_json=True) the same as Jinja '{{ var.json.x }}'?

I have a Variable stored like this:

x | {"a": 1, "b": 2}

I want to retrieve this full JSON as a dict in my DAG script. The following works:

from airflow.models import Variable
op = CustomOperator(
  templated_field = Variable.get('x', deserialize_json = True)
)
# called in CustomOperator:
templated_field.keys()
# dict_keys(['a', 'b'])

The documentation suggests the following should be equivalent:

The second call [using Variable.get(deserialize_json=True)] assumes json content and will be deserialized...

...or if you need to deserialize a json object from the variable [using a Jinja template]:

{{ var.json.<variable_name> }}

However, when I use this and then try to extract the keys, I get an error:

op = CustomOperator(
  templated_field = '{{ var.json.x }}'
)
# called in CustomOperator
templated_field.keys()

AttributeError: 'str' object has no attribute 'keys'

The error suggests the JSON was not actually deserialized:

# same error
"{'a': 1, 'b': 2}".keys()

The only examples of using the var.json approach I've found online don't extract a dict from JSON, but rather use a JSON path to extract a scalar:

# https://www.applydatascience.com/airflow/airflow-variables/
{{ var.json.example_variables_config.var3 }}
# https://gist.github.com/kaxil/61f41dd87a69230d1a637dc3a1d2fa2c
{{ var.json.dag1_config.var1 }}

Hossein helpfully points out that this field should be templated; here's the line from CustomOperator:

class CustomOperator(SuperCustomOperator):
    template_fields = SuperCustomOperator.template_fields + ('template_field',)

Am I missing something? Is the Jinja var.json approach only suitable for extracting scalars, or can it be used to extract JSON-as-dict as well?

Upvotes: 2

Views: 7086

Answers (2)

MH_void

Reputation: 11

I found this answer, which helped me with a similar issue: passing a dictionary as a variable. The problem is that Jinja2's {{ var.json.x }} echoes the value of var.json.x, so it converts the dict to a str. As a side effect, the double quotes become single quotes in the JSON-like str, so you can't load the string directly as JSON.
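To make the quoting problem concrete, here is a minimal standard-library sketch. It assumes that rendering a bare {{ ... }} expression stringifies the value much like str() does; the helper is_valid_json is hypothetical and exists only for this illustration.

```python
import json

x = {"a": 1, "b": 2}  # the Variable's value after deserialization

# Assumption: a bare {{ ... }} expression stringifies the value, much like str().
rendered = str(x)


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(rendered)                       # Python repr with single quotes
print(is_valid_json(rendered))        # the repr is not valid JSON
print(is_valid_json(json.dumps(x)))   # json.dumps emits double quotes
```

The single-quoted repr is a Python literal, not JSON, which is why json.loads rejects it.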

My solution was to add a custom filter that renders the dictionary as a JSON string (double-quoted where needed), which I can then convert back to a dictionary inside my function.

The code:

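import json

def func(input_dict):
    # the template renders to a JSON string, so parse it back into a dict
    input_dict = json.loads(input_dict)
    ...

dag = DAG(
    ...
    # render values as JSON (double-quoted) instead of Python's str()
    user_defined_filters={'tojson': lambda s: json.dumps(s)},
)

op = PythonOperator(
    ...
    # the template must be a quoted string; also applicable to other templated fields
    op_kwargs={"input_dict": "{{ var.json.x | tojson }}"},
    python_callable=func,
)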

It works as expected, and now I can use input_dict as a dictionary.
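The round trip can be checked without Airflow. This is only a sketch: the tojson lambda stands in for the user-defined filter, and the call to it stands in for template rendering, which is an assumption about what the templated field holds afterwards.

```python
import json

# Stand-in for the user-defined filter registered on the DAG.
tojson = lambda s: json.dumps(s)


def func(input_dict):
    # The callable receives a JSON string and parses it back into a dict.
    input_dict = json.loads(input_dict)
    return sorted(input_dict.keys())


x = {"a": 1, "b": 2}
rendered = tojson(x)   # assumed content of the templated field after rendering
keys = func(rendered)
print(rendered)
print(keys)
```

Because the filter emits real JSON, json.loads recovers a dict with the same keys as the stored Variable.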

Upvotes: 1

Hossein Torabi

Reputation: 733

Actually, you should use it on the templated fields! Templated fields are the fields that are rendered by Jinja. For example, consider this BashOperator:

Task = BashOperator(
    task_id='bash_task',
    bash_command='echo {{ var.json.x }}', 
    dag=dag,
)

Upvotes: 0
