Divyaansh Bajpai

Reputation: 232

Airflow XCom not getting resolved; xcom_pull returns the task_instance template string

I am facing an odd issue with xcom_pull: instead of the pushed value, it always returns the template string itself, "{{ task_instance.xcom_pull(dag_id = 'cf_test',task_ids='get_config_val',key='http_con_id') }}"

My requirement is simple. I push an XCom from a PythonOperator, then try to retrieve the value with xcom_pull and pass it as the http_conn_id of a SimpleHttpOperator. However, the variable holds the template string instead of the resolved xcom_pull value. The PythonOperator pushes the XCom successfully.

Code:

from datetime import datetime

import simplejson as json
from airflow.models import DAG
from airflow.operators.dummy_operator import DummyOperator   
from airflow.operators.python_operator import PythonOperator
from airflow.providers.http.operators.http import SimpleHttpOperator
from google.auth.transport.requests import Request

default_airflow_args = {
    "owner": "divyaansh",
    "depends_on_past": False,
    "start_date": datetime(2022, 5, 18),        
    "retries": 0,
    "schedule_interval": "@hourly",
}

project_configs = {
    "project_id": "test",
    "conn_id": "google_cloud_storage_default",
    "bucket_name": "test-transfer",
    "folder_name": "processed-test-rdf",
}


def get_config_vals(**kwargs) -> None:
    """
    Get config values from an Airflow variable and store them as XComs.
    """

    task_instance = kwargs["task_instance"]

    task_instance.xcom_push(key="http_con_id", value="gcp_cloud_function")


def generate_api_token(cf_name: str):
    """
    generate token for api request
    """
    import google.oauth2.id_token    
    
    request = Request()

    target_audience = f"https://us-central1-test-a2h.cloudfunctions.net/{cf_name}"

    return google.oauth2.id_token.fetch_id_token(
        request=request, audience=target_audience
    )


with DAG(
    dag_id="cf_test",
    default_args=default_airflow_args,
    catchup=False,
    render_template_as_native_obj=True,
) as dag:

    start = DummyOperator(task_id="start")

    config_vals = PythonOperator(
        task_id="get_config_val", python_callable=get_config_vals, provide_context=True
    )

    ip_data = json.dumps(
        {
            "bucket_name": project_configs["bucket_name"],
            "file_name": "dummy",
            "target_location": "/valid",
        }
    )

    conn_id = "{{ task_instance.xcom_pull(dag_id = 'cf_test',task_ids='get_config_val',key='http_con_id') }}"

    api_token = generate_api_token("new-cp")

    cf_task = SimpleHttpOperator(
        task_id="file_decrypt_and_validate_cf",
        http_conn_id=conn_id,
        method="POST",
        endpoint="new-cp",
        data=json.dumps(
            json.dumps(
                {
                    "bucket_name": "test-transfer",
                    "file_name": [
                        "processed-test-rdf/dummy_20220501.txt",
                        "processed-test-rdf/dummy_20220502.txt",                            
                    ],
                    "target_location": "/valid",
                }
            )
        ),
        headers={
            "Authorization": f"bearer {api_token}",
            "Content-Type": "application/json",                
        },
        do_xcom_push=True,
        log_response=True,
    )

    print("task new-cp", cf_task)   
    

    check_flow = DummyOperator(task_id="check_flow")

    end = DummyOperator(task_id="end")

start >> config_vals >> cf_task >> check_flow >> end

Error Message:

raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.exceptions.AirflowNotFoundException: The conn_id `"{{ task_instance.xcom_pull(dag_id = 'cf_test',task_ids='get_config_val',key='http_con_id') }}"` isn't defined

I have tried several different ways, but nothing seems to work. Can someone point me in the right direction?

Airflow version: 2.2.3
Composer version: 2.0.11

Upvotes: 0

Views: 1622

Answers (1)

Elad Kalif

Reputation: 16079

In SimpleHttpOperator, the http_conn_id parameter is not a templated field, so the Jinja engine is never applied to it and its value cannot be rendered. When you pass "{{ task_instance.xcom_pull(dag_id = 'cf_test',task_ids='get_config_val',key='http_con_id') }}" to the operator, you expect it to be replaced at runtime with the value the previous task stored in XCom, but Airflow treats it as a regular string. That is also what the exception tells you: Airflow looks for a connection whose name is that very long string, cannot find one, and reports that the connection is not defined.

To solve it you can create a custom operator:

class MySimpleHttpOperator(SimpleHttpOperator):
    template_fields = SimpleHttpOperator.template_fields + ("http_conn_id",)

Then replace SimpleHttpOperator with MySimpleHttpOperator in your DAG.

This change causes the string you set in http_conn_id to be passed through the Jinja engine, so in your case it will be replaced with the XCom value, as you expect.
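To illustrate why extending template_fields has this effect, here is a toy model in plain Python. It is only a sketch and not Airflow's implementation: ToyOperator, ToyTemplatedConnOperator, and render are hypothetical stand-ins, and the regex substitution stands in for Jinja. The only idea it takes from Airflow is that rendering is applied solely to the attributes named in template_fields.

```python
import re


def render(text: str, context: dict) -> str:
    """Hypothetical stand-in for Jinja: replaces {{ name }} with context[name]."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(context[match.group(1)]),
        text,
    )


class ToyOperator:
    """Toy stand-in for an operator; NOT Airflow's BaseOperator."""

    # Only attributes named here are rendered before execution.
    template_fields = ("endpoint", "data")

    def __init__(self, http_conn_id: str, endpoint: str, data: str = ""):
        self.http_conn_id = http_conn_id
        self.endpoint = endpoint
        self.data = data

    def render_template_fields(self, context: dict) -> None:
        # Attributes that are not listed in template_fields are left untouched.
        for name in self.template_fields:
            value = getattr(self, name)
            if isinstance(value, str):
                setattr(self, name, render(value, context))


class ToyTemplatedConnOperator(ToyOperator):
    """Same operator, with http_conn_id added to the rendered fields."""

    template_fields = ToyOperator.template_fields + ("http_conn_id",)


# Assumed context for the sketch: the value that the earlier task pushed.
context = {"conn": "gcp_cloud_function"}

plain = ToyOperator(http_conn_id="{{ conn }}", endpoint="new-cp")
plain.render_template_fields(context)

patched = ToyTemplatedConnOperator(http_conn_id="{{ conn }}", endpoint="new-cp")
patched.render_template_fields(context)

print(repr(plain.http_conn_id))
print(repr(patched.http_conn_id))
```

In the toy model, plain keeps the literal template string, which corresponds to the situation in the question, while patched ends up with the rendered connection name.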

Upvotes: 3
