Wouter De Raeve

Reputation: 117

Splitting set of JSON files using Azure Data Factory into multiple files

I am pulling data from an API at regular intervals (in this case every 5 minutes or more frequently). Example of the returned info:

{
  "timestamp":"2022-09-28T00:33:53Z",
  "data":[
    {
      "id":"bdcb2ad8-9e19-4468-a4f3-b440de3a7b40",
      "value":3,
      "created": "2020-09-28T00:00:00Z"
    },
    {
      "id":"7f8d07eb-433b-404c-a9b3-f1832bdd780f",
      "value":4,
      "created": "2020-09-28T00:00:00Z"
    },
    {
      "id":"7f8d07eb-433b-404c-a9b3-f1832bdd780f",
      "value":6,
      "created": "2020-09-28T00:05:00Z"
    }
  ]
}

So after a certain amount of time I would have a set of files in the landing zone:

etc...

I would like to use ADF to take those files and split them by a certain property (the id), so that they end up in a hierarchy like this:

/bdcb2ad8-9e19-4468-a4f3-b440de3a7b40/2022-09-28T00:33:53Z.json
/bdcb2ad8-9e19-4468-a4f3-b440de3a7b40/2022-09-28T00:43:44Z.json
/7f8d07eb-433b-404c-a9b3-f1832bdd780f/2022-09-28T00:33:53Z.json
/7f8d07eb-433b-404c-a9b3-f1832bdd780f/2022-09-28T00:43:44Z.json

where each file contains only the data for that id.

Any thoughts on how to pull this off? Bonus: if, for every run of the ADF pipeline, I could union the files into a single one, that would be great, but it is not absolutely necessary.

Upvotes: 1

Views: 1211

Answers (1)

Nagasai

Reputation: 51

Read the source file.

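One common way to read it is a Lookup activity over a JSON dataset that points at the landing-zone file. A rough sketch of such a Lookup (the activity name ReadSourceFile and dataset name LandingZoneJson are placeholders, and the store/format settings are omitted):

{
  "name": "ReadSourceFile",
  "type": "Lookup",
  "typeProperties": {
    "source": { "type": "JsonSource" },
    "dataset": {
      "referenceName": "LandingZoneJson",
      "type": "DatasetReference"
    },
    "firstRowOnly": true
  }
}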

Iterate over the data array from the source file with a ForEach activity.

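Assuming the source file was read by a Lookup named ReadSourceFile as sketched above, the ForEach items expression can reference its output like this; the Copy activity from the following steps then sits inside the ForEach's activities list (omitted here):

{
  "name": "ForEachDataItem",
  "type": "ForEach",
  "typeProperties": {
    "items": {
      "value": "@activity('ReadSourceFile').output.firstRow.data",
      "type": "Expression"
    }
  }
}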

Add FolderName and FileName parameters to the sink dataset.

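In the sink dataset JSON this corresponds to a parameters section like the following (the dataset is referred to as SinkJson further below, which is just a placeholder name):

"parameters": {
  "FolderName": { "type": "string" },
  "FileName": { "type": "string" }
}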

Set the sink folder path and file name dynamically from those parameters.

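Those parameters can then drive the dataset's file location. For an ADLS Gen2 sink this could look roughly as follows (the fileSystem value is only an assumption):

"typeProperties": {
  "location": {
    "type": "AzureBlobFSLocation",
    "fileSystem": "output",
    "folderPath": { "value": "@dataset().FolderName", "type": "Expression" },
    "fileName": { "value": "@dataset().FileName", "type": "Expression" }
  }
}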

In the Copy activity, provide values for those parameters.

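As a sketch, the Copy activity's sink dataset reference can pass the current item's id as the folder and the source file's timestamp as the file name (placeholder names from the earlier snippets are reused, and the exact FileName expression is an assumption):

"outputs": [
  {
    "referenceName": "SinkJson",
    "type": "DatasetReference",
    "parameters": {
      "FolderName": { "value": "@item().id", "type": "Expression" },
      "FileName": {
        "value": "@concat(activity('ReadSourceFile').output.firstRow.timestamp, '.json')",
        "type": "Expression"
      }
    }
  }
]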

The result is the expected set of files.


Upvotes: 1
