Reputation: 37
I have a azure data factory pipeline with a lookup activity that check a JSON file.
The size is like below in azure: Azure Blog Size Screenshot
and when I download it, I see below values for the file. so it's not larger that the value the error states: "The size 5012186 of lookup activity result exceeds the limitation 4194304"
Size of the data as opened in Notepad ++
Also below is the design of my pipeline that gets stuck: Pipeline design - Lookup Activity to Read my model.json file to retrieve metadata
Any ideas on how to tackle this issue? thanks in advance
Upvotes: 1
Views: 6090
Reputation: 8301
As lookup has the limitation of 5000 rows, you can try the below workaround for this.
To overcome this the workaround is as mentioned in Microsoft Document
Design a two-level pipeline where the outer pipeline iterates over an inner pipeline, which retrieves data that doesn't exceed the maximum rows or size.
Possible solution:
First, try to save your files list as JSON files to a folder of Blob storage with the size of 5000 rows.
create Get Metadata activity which can fetch the files from the folder
Get Metadata activity settings
Then create For-each activity to Iterate over files
In for-each activity setting give items as @activity('Get Metadata1').output.childItems
For the files create a dataset and give the folder name manually and for filename use the dataset parameter, which we can give the filename in the lookup inside the parent ForEach.
Lookup activity inside Parent ForEach give the file name as @string(item().name)
Execute Pipeline activity:
Before this create an array parameter in the child pipeline and pass the look up output inside ForEach to that in the Execute Pipeline activity.
Give look up output @activity('Lookup1').output.value
Now create inside the Child Pipeline and give the array parameter to the ForEach as @pipeline().parameters.ok
You can use which ever activity you want inside this ForEach, here I have used append.
Then create result1 variable as array
and give value as @variables('arrayid')
The Output will be the array of all ids in the file
BgULa02.png)
Upvotes: 3