CFCJB John
CFCJB John

Reputation: 33

Azure Synapse Pipeline Get Metadata Filter by last modified to use date in filename instead of last modified date of file

Within the pipeline's Get Metadata1 activity, I would like to filter the files to be copied by the date inside the filename instead of the date of the file:

example: 20230301_FC_ServiceUsage_Common_Electricity.txt.gz Date Modified on File: 3-02-3023

I want to use the 20230301 date within the filename instead of the date modified on the file itself to filter the files selected for copy.

Get Metadata1 passes Childitems to a ForEach loop.

Get Metadata: Filter by last modified

The date range is entered in a parameter: Get Metadata: BeginRange and EndRange Parameters

Is it possible to do this within the "Filter by last modified" Get Metadata1 or do I need to do this by some other mechanism?

I've tried a few things that have not worked.

Upvotes: 0

Views: 640

Answers (2)

Rakesh Govindula
Rakesh Govindula

Reputation: 11514

You can achieve your requirement using Filter activity as well.

Use the filter activity to filter the file names array from Get Meta data activity.

Here, these are my file names array from Get Meta data activity.

enter image description here

I took the below start and end date.

enter image description here

Then I have converted the start and end dates to integer and saved to integer variables using the below expressions.

start date integer variable:

@int(concat(split(pipeline().parameters.start_date,'-')[2],split(pipeline().parameters.start_date,'-')[1],split(pipeline().parameters.start_date,'-')[0]))

End date integer variable:

@int(concat(split(pipeline().parameters.end_date,'-')[2],split(pipeline().parameters.end_date,'-')[1],split(pipeline().parameters.end_date,'-')[0]))

Now, take a filter activity and give Get meta data child items array in the items of filter activity.

In the condition give the below expression.

@and(greaterOrEquals(variables('end_date_int'), int(split(item().name,'_')[0])),greaterOrEquals(int(split(item().name,'_')[0]), variables('start_date_int')))

enter image description here

Filter activity output:

enter image description here

NOTE: For the comparison in the filter activity expression, the dates start date, end date and date from file name should be in yyyyMMdd format and should be converted to integer.

So, if your file name is not in yyyyMMdd format, use substring() and concat after splitting to get in yyyyMMdd format.

Upvotes: 1

Nandan
Nandan

Reputation: 4945

Please follow the below steps:

  1. get list of all files via get mera data activity and pass it through for each
  2. within for each, use if activity to check the range wherein item().name is that iterations entire file name. You can use split function via _ which would convert the filename into array and then get the 1st element within array and use it to compare the date range

Upvotes: 0

Related Questions