Tom

Reputation: 41

How to compare Get Metadata Structure output with known structure to validate file?

I am loading a blob storage file to SQL based on an event trigger in ADF, and want to validate that the metadata for that file conforms to a known template before running subsequent activities. How would I write an expression in an If Condition to check that the 'structure' output object for the Get Metadata activity matches a known structure?

There is a set of collection functions, such as 'contains', which may be applicable, but I don't know how to make the expression compare the output object, which I believe is a list, with the string that represents the column names and types. Below is my non-functional attempt...

@equals(activity('Get Metadata').output.structure, '[{"name": "ID","type": "String"},{"name": "reg_number","type": "String"},...,{"name":"final_column","type":"String"}]')

I'm struggling to find any examples online of metadata validation within Data Factory that might help with this. The Validation activity seems to be simply a traffic light for whether the blob file exists at all.

Upvotes: 2

Views: 2005

Answers (2)

Garry

Reputation: 21

I came across this post after implementing the expression below to do something similar. My idea was to use a control file with a known schema, which can be updated if the schema changes without having to modify the pipeline.

@equals( activity('Get Metadata Data').output.structure , activity('Get Metadata KnownFile').output.structure )

I will add Tom's technique to my templates, but I will store the string in a setup table, which can also be updated when required.
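
As a rough sketch of the setup-table idea, assuming a Lookup activity reads the expected structure as a JSON string from a single row. The activity name 'Lookup ExpectedSchema' and the column name structure_json are hypothetical, and this has not been run against a real pipeline:

```
@equals(
    activity('Get Metadata Data').output.structure,
    json(activity('Lookup ExpectedSchema').output.firstRow.structure_json)
)
```

The Lookup would need 'First row only' enabled for output.firstRow to be available.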

Upvotes: 1

Tom

Reputation: 41

I found that the 'structure' output object of the Get Metadata activity is an array of JSON objects, so I used the functions createArray() and json() to build an equivalent array to compare against, which seems to have worked. I'm sure there's a more elegant solution out there, though.

@equals(activity('Get Metadata').output.structure,createArray(
        json('{
            "name": "ID",
            "type": "String"
        }'),...
        json('{
            "name": "final_column",
            "type": "String"
        }')
))
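
A possibly shorter variant, offered as an untested sketch: json() can also be given the whole array as one string, which would avoid wrapping each column in createArray(). This assumes json() returns an array when passed a JSON array string, and that equals() compares arrays element by element and in order, as it appears to above. The template here is cut down to two columns purely for illustration.

```
@equals(
    activity('Get Metadata').output.structure,
    json('[{"name":"ID","type":"String"},{"name":"final_column","type":"String"}]')
)
```

Because the comparison is on the whole array, a renamed, missing, or reordered column should make the condition false.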

Upvotes: 2
