Reputation: 31
I am trying to add some validation to my ADF pipeline: before processing a source file, I want to check that its column names and data types match an expected schema. Is there a way to achieve this validation in ADF?
Upvotes: 1
Views: 809
Reputation: 11454
Adding to @Nandan, you can use the Get Metadata activity's structure field like below.
This is my repro for your reference:
First, I have used two parameters for the expected column names and data types.
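With the defaults from this repro (also in the pipeline JSON below), they are:

    "names":   ["A", "B", "C"]
    "columns": ["String", "String", "String"]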
Get Metadata activity output array:
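With fieldList set to structure, the activity returns each column's name and type. For a file matching the parameter defaults, the output would resemble the following (illustrative values):

    {
        "structure": [
            { "name": "A", "type": "String" },
            { "name": "B", "type": "String" },
            { "name": "C", "type": "String" }
        ]
    }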
Then, I have used a ForEach activity over the Get Metadata output to append the above names and types to two array variables.
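Inside the ForEach (items set to @activity('Get Metadata1').output.structure), two Append variable activities store each column's metadata:

    Append names -> namesvararray: @item().name
    Append types -> typevararray:  @item().type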
Then, I have used two Filter activities to filter the above parameter arrays against those variable arrays.
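Each Filter takes a parameter array as items and keeps only the expected values that were actually found in the file, e.g. for the names:

    items:     @pipeline().parameters.names
    condition: @contains(variables('namesvararray'), item())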
Then, I used an If activity to compare the parameter array lengths against the Filter activity output array lengths. If it evaluates to true, you can use your Copy activity or data flow inside the True activities as per your requirement; inside the False activities, use a Fail activity.
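The If expression (also visible in the pipeline JSON below) checks both counts at once, so every expected name and type must have survived its Filter:

    @and(
        equals(length(pipeline().parameters.names), activity('Filtering names').output.FilteredItemsCount),
        equals(length(pipeline().parameters.columns), activity('Filtering types').output.FilteredItemsCount)
    )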
My pipeline JSON:
{
    "name": "pipeline1",
    "properties": {
        "activities": [
            {
                "name": "Get Metadata1",
                "type": "GetMetadata",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "dataset": {
                        "referenceName": "Excel1",
                        "type": "DatasetReference"
                    },
                    "fieldList": [
                        "structure"
                    ],
                    "storeSettings": {
                        "type": "AzureBlobFSReadSettings",
                        "enablePartitionDiscovery": false
                    }
                }
            },
            {
                "name": "Filtering names",
                "type": "Filter",
                "dependsOn": [
                    {
                        "activity": "Getting names and columns as list",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@pipeline().parameters.names",
                        "type": "Expression"
                    },
                    "condition": {
                        "value": "@contains(variables('namesvararray'),item())",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "Filtering types",
                "type": "Filter",
                "dependsOn": [
                    {
                        "activity": "Filtering names",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@pipeline().parameters.columns",
                        "type": "Expression"
                    },
                    "condition": {
                        "value": "@contains(variables('typevararray'), item())",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "Getting names and columns as list",
                "type": "ForEach",
                "dependsOn": [
                    {
                        "activity": "Get Metadata1",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "items": {
                        "value": "@activity('Get Metadata1').output.structure",
                        "type": "Expression"
                    },
                    "isSequential": true,
                    "activities": [
                        {
                            "name": "Append names",
                            "type": "AppendVariable",
                            "dependsOn": [],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "namesvararray",
                                "value": {
                                    "value": "@item().name",
                                    "type": "Expression"
                                }
                            }
                        },
                        {
                            "name": "Append types",
                            "type": "AppendVariable",
                            "dependsOn": [
                                {
                                    "activity": "Append names",
                                    "dependencyConditions": [
                                        "Succeeded"
                                    ]
                                }
                            ],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "typevararray",
                                "value": {
                                    "value": "@item().type",
                                    "type": "Expression"
                                }
                            }
                        }
                    ]
                }
            },
            {
                "name": "If Condition1",
                "type": "IfCondition",
                "dependsOn": [
                    {
                        "activity": "Filtering types",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "expression": {
                        "value": "@and(equals(length(pipeline().parameters.names),activity('Filtering names').output.FilteredItemsCount),equals(length(pipeline().parameters.columns),activity('Filtering types').output.FilteredItemsCount))",
                        "type": "Expression"
                    },
                    "ifFalseActivities": [
                        {
                            "name": "Fail1",
                            "type": "Fail",
                            "dependsOn": [],
                            "userProperties": [],
                            "typeProperties": {
                                "message": "Some of the headers or types are not as required",
                                "errorCode": "240"
                            }
                        }
                    ],
                    "ifTrueActivities": [
                        {
                            "name": "Set variable1",
                            "type": "SetVariable",
                            "dependsOn": [],
                            "userProperties": [],
                            "typeProperties": {
                                "variableName": "sample",
                                "value": "All good"
                            }
                        }
                    ]
                }
            }
        ],
        "parameters": {
            "names": {
                "type": "array",
                "defaultValue": [
                    "A",
                    "B",
                    "C"
                ]
            },
            "columns": {
                "type": "array",
                "defaultValue": [
                    "String",
                    "String",
                    "String"
                ]
            }
        },
        "variables": {
            "namesvararray": {
                "type": "Array"
            },
            "typevararray": {
                "type": "Array"
            },
            "sample": {
                "type": "String"
            }
        },
        "annotations": []
    }
}
When the file's headers or types did not match, my pipeline failed with the Fail activity's error: "Some of the headers or types are not as required".
Upvotes: 1
Reputation: 4945
You can use a Lookup activity on the dataset and return the first row (with the dataset's header property disabled). This would give you the list of columns present in the Excel file, which you can then compare against the expected values. If the values/sequence match, you can proceed further; else you can throw an error, as in the sketch below. Note: you can also use the Get Metadata activity to get the column details.
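A minimal sketch of that Lookup, assuming an Excel dataset named Excel1 with its header property disabled (the activity and dataset names are placeholders):

    {
        "name": "Lookup1",
        "type": "Lookup",
        "typeProperties": {
            "source": { "type": "ExcelSource" },
            "dataset": { "referenceName": "Excel1", "type": "DatasetReference" },
            "firstRowOnly": true
        }
    }

With headers disabled, ADF auto-generates column names such as Prop_0, Prop_1, ..., so the first row holds the actual header values, and a check can look like @equals(activity('Lookup1').output.firstRow.Prop_0, 'A').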
For data types, you can use column patterns in data flows: https://learn.microsoft.com/en-us/azure/data-factory/concepts-data-flow-column-pattern
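For instance, a column pattern in a Derived Column transformation can match columns on their type metadata; a rough sketch based on the linked page ($$ resolves to each matched column's name or value at run time):

    matches:     type == 'string'
    column name: $$
    expression:  trim($$)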
@rakeshGovindula: any more thoughts?
Upvotes: 0