soiryk139

Reputation: 85

Parsing Azure Blob Queue Message in Azure Data Factory

I use an Azure Data Factory Web activity to get queue messages from Azure storage. The activity output looks like this:

{
    "Response": "
<?xml version=\"1.0\" encoding=\"utf-8\"?>
<QueueMessagesList>
    <QueueMessage>
        <MessageId>12345678</MessageId>
        <InsertionTime>Mon, 12 Jun 2023 07:31:05 GMT</InsertionTime>
        <ExpirationTime>Mon, 19 Jun 2023 07:31:05 GMT</ExpirationTime>
        <PopReceipt>AgAAAAMAAAAAAAAAT8iLmwid2QE=</PopReceipt>
        <TimeNextVisible>Mon, 12 Jun 2023 08:33:47 GMT</TimeNextVisible>
        <DequeueCount>25</DequeueCount>
        <MessageText>12062023/test1.csv</MessageText>
    </QueueMessage>
    <QueueMessage>
        <MessageId>45678910</MessageId>
        <InsertionTime>Mon, 12 Jun 2023 07:30:42 GMT</InsertionTime>
        <ExpirationTime>Mon, 19 Jun 2023 07:30:42 GMT</ExpirationTime>
        <PopReceipt>AgAAAAMAAAAAAAAAT8iLmwid2QE=</PopReceipt>
        <TimeNextVisible>Mon, 12 Jun 2023 08:33:47 GMT</TimeNextVisible>
        <DequeueCount>25</DequeueCount>
        <MessageText>12062023/test2.csv</MessageText>
    </QueueMessage>
    <QueueMessage>
        <MessageId>1112012323</MessageId>
        <InsertionTime>Mon, 12 Jun 2023 08:33:08 GMT</InsertionTime>
        <ExpirationTime>Mon, 19 Jun 2023 08:33:08 GMT</ExpirationTime>
        <PopReceipt>AgAAAAMAAAAAAAAAT8iLmwid2QE=</PopReceipt>
        <TimeNextVisible>Mon, 12 Jun 2023 08:33:47 GMT</TimeNextVisible>
        <DequeueCount>1</DequeueCount>
        <MessageText>11062023/test3.csv</MessageText>
    </QueueMessage>
</QueueMessagesList>",
    "ADFWebActivityResponseHeaders": {
        "Transfer-Encoding": "chunked",
        "x-ms-request-id": "jksasjkacas",
        "x-ms-version": "2020-04-08",
        "Cache-Control": "no-cache",
        "Date": "Mon, 12 Jun 2023 08:33:17 GMT",
        "Server": "Windows-Azure-Queue/1.0;Microsoft-HTTPAPI/2.0",
        "Content-Type": "application/xml"
    },
    "effectiveIntegrationRuntime": "Runtime2",
    "executionDuration": 0,
    "durationInQueue": {
        "integrationRuntimeQueue": 0
    },
    "billingReference": {
        "activityType": "ExternalActivity",
        "billableDuration": [
            {
                "meterType": "AzureIR",
                "duration": 0.016666666666666666,
                "unit": "Hours"
            }
        ]
    }
}

I would like to feed each MessageText value from this output into a ForEach activity and run the next pipeline step for each one.

In the ForEach Items setting, I used:

@split(activity('get_queue_message').output.Response, '</MessageText>')

Within the ForEach, I have a Set Variable activity with the following value:

@split(item(), '<MessageText>')[1]

This configuration returns the 3 MessageText values in the sample above, but there is always one extra Set Variable run that fails with this error:

The expression 'split(item(), '<MessageText>')[1]' cannot be evaluated because array index '1' is outside bounds (0, 0) of array.

Can you please help me see what I have done wrong? Thank you.

Upvotes: 0

Views: 188

Answers (1)

Saideep Arikontham

Reputation: 6104

  • I took the given response as the value of a pipeline parameter of object type. Using the same dynamic content, I got the same error.

  • This happens because the element at index 1 of the array we are trying to access does not exist. The following is the value of item() in the iteration where the step fails:
{ 
    "variableName": "tp2", 
    "value": " </QueueMessage> </QueueMessagesList>" 
}

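  • Splitting the response on '</MessageText>' produces one more piece than there are messages, because the text after the last closing tag becomes a piece of its own. When @split(item(), '<MessageText>')[1] is applied to that last piece, which contains no <MessageText>, the split returns a single-element array and index 1 is out of bounds. So, check whether <MessageText> is present before indexing into the split result.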
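
The effect can be reproduced outside Data Factory. The following is a minimal Python sketch, assuming that Python's str.split behaves like the ADF split() function for this input; the shortened response string is a hypothetical stand-in for the real Web activity output.

```python
# Shortened, hypothetical stand-in for the Web activity's Response string.
response = (
    "<QueueMessagesList>"
    "<QueueMessage><MessageText>12062023/test1.csv</MessageText></QueueMessage>"
    "<QueueMessage><MessageText>12062023/test2.csv</MessageText></QueueMessage>"
    "<QueueMessage><MessageText>11062023/test3.csv</MessageText></QueueMessage>"
    "</QueueMessagesList>"
)

# Counterpart of the ForEach Items expression: split on the closing tag.
items = response.split("</MessageText>")

# Counterpart of the inner split: count the parts for every ForEach item.
part_counts = [len(piece.split("<MessageText>")) for piece in items]

for piece, count in zip(items, part_counts):
    # An item with only one part has no index 1, which is the failing case.
    print(count, repr(piece))
```

Three closing tags give four items, and only the last one, the XML trailer, has a single part after the inner split.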

  • I used the following dynamic content instead of @split(item(), '<MessageText>')[1] to get the expected output:

@if(contains(item(),'<MessageText>'),split(item(),'<MessageText>')[1],'no message text')

  • When you run the pipeline, each message text is extracted as before, and for an item that contains no <MessageText> the condition assigns the variable the value 'no message text' instead of failing.
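
As an illustration of what the guarded expression does per item, here is a small Python sketch. The function name message_text and the sample items are hypothetical, and the final filtering step is an assumption about how a later step might skip the placeholder; it is not part of the pipeline above.

```python
FALLBACK = "no message text"

def message_text(item: str) -> str:
    """Python counterpart of
    @if(contains(item(),'<MessageText>'),split(item(),'<MessageText>')[1],'no message text')
    """
    if "<MessageText>" in item:
        return item.split("<MessageText>")[1]
    return FALLBACK

# Hypothetical ForEach items, shaped like the pieces produced by the outer split.
items = [
    "<QueueMessage><MessageText>12062023/test1.csv",
    "</QueueMessage><QueueMessage><MessageText>12062023/test2.csv",
    "</QueueMessage></QueueMessagesList>",
]

values = [message_text(item) for item in items]

# Assumption: downstream steps ignore the placeholder value.
file_names = [v for v in values if v != FALLBACK]
print(values)
print(file_names)
```

Because the trailer item still runs through the loop, any activity that consumes the variable would need a similar check so that it does not treat the placeholder as a file name.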

  • The following is the complete pipeline JSON:
{
    "name": "pipeline1",
    "properties": {
        "activities": [
            {
                "name": "ForEach1",
                "type": "ForEach",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@split(pipeline().parameters.response.Response,'</MessageText>')",
                        "type": "Expression"
                    },
                    "isSequential": true,
                    "activities": [
                        {
                            "name": "1st index array value",
                            "type": "SetVariable",
                            "dependsOn": [
                                {
                                    "activity": "current string",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "tp",
                                "value": {
                                    "value": "@if(contains(item(),'<MessageText>'),split(item(),'<MessageText>')[1],'no message text')",
                                    "type": "Expression"
                                }
                            }
                        },
                        {
                            "name": "current string",
                            "type": "SetVariable",
                            "dependsOn": [],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "tp2",
                                "value": {
                                    "value": "@item()",
                                    "type": "Expression"
                                }
                            }
                        }
                    ]
                }
            }
        ],
        "parameters": {
            "response": {
                "type": "object",
                "defaultValue": {
                    "Response": " <?xml version=\"1.0\" encoding=\"utf-8\"?> <QueueMessagesList>     <QueueMessage>         <MessageId>12345678</MessageId>         <InsertionTime>Mon, 12 Jun 2023 07:31:05 GMT</InsertionTime>         <ExpirationTime>Mon, 19 Jun 2023 07:31:05 GMT</ExpirationTime>         <PopReceipt>AgAAAAMAAAAAAAAAT8iLmwid2QE=</PopReceipt>         <TimeNextVisible>Mon, 12 Jun 2023 08:33:47 GMT</TimeNextVisible>         <DequeueCount>25</DequeueCount>         <MessageText>12062023/test1.csv</MessageText>     </QueueMessage>     <QueueMessage>         <MessageId>45678910</MessageId>         <InsertionTime>Mon, 12 Jun 2023 07:30:42 GMT</InsertionTime>         <ExpirationTime>Mon, 19 Jun 2023 07:30:42 GMT</ExpirationTime>         <PopReceipt>AgAAAAMAAAAAAAAAT8iLmwid2QE=</PopReceipt>         <TimeNextVisible>Mon, 12 Jun 2023 08:33:47 GMT</TimeNextVisible>         <DequeueCount>25</DequeueCount>         <MessageText>12062023/test2.csv</MessageText>     </QueueMessage>     <QueueMessage>         <MessageId>1112012323</MessageId>         <InsertionTime>Mon, 12 Jun 2023 08:33:08 GMT</InsertionTime>         <ExpirationTime>Mon, 19 Jun 2023 08:33:08 GMT</ExpirationTime>         <PopReceipt>AgAAAAMAAAAAAAAAT8iLmwid2QE=</PopReceipt>         <TimeNextVisible>Mon, 12 Jun 2023 08:33:47 GMT</TimeNextVisible>         <DequeueCount>1</DequeueCount>         <MessageText>11062023/test3.csv</MessageText>     </QueueMessage> </QueueMessagesList>",
                    "ADFWebActivityResponseHeaders": {
                        "Transfer-Encoding": "chunked",
                        "x-ms-request-id": "jksasjkacas",
                        "x-ms-version": "2020-04-08",
                        "Cache-Control": "no-cache",
                        "Date": "Mon, 12 Jun 2023 08:33:17 GMT",
                        "Server": "Windows-Azure-Queue/1.0;Microsoft-HTTPAPI/2.0",
                        "Content-Type": "application/xml"
                    },
                    "effectiveIntegrationRuntime": "Runtime2",
                    "executionDuration": 0,
                    "durationInQueue": {
                        "integrationRuntimeQueue": 0
                    },
                    "billingReference": {
                        "activityType": "ExternalActivity",
                        "billableDuration": [
                            {
                                "meterType": "AzureIR",
                                "duration": 0.016666666666666666,
                                "unit": "Hours"
                            }
                        ]
                    }
                }
            }
        },
        "variables": {
            "tp": {
                "type": "String"
            },
            "tp2": {
                "type": "String"
            }
        },
        "annotations": []
    }
}

Upvotes: 0
