GelineAstra
GelineAstra

Reputation: 37

How can i aggregate a JSON using ADF Dataflow?

i'm using ADF for the fist time to take datas from a CSV and elaborate them in a JSON

i have a structure that is like:

|father |date |record1 |record2 |...|recordn |
|father |date2|record1A|record2A|...|recordnA|
|father |date3|record1B|record2B|...|recordnA|
|father2|date |record1 |record2 |...|recordn |
|father2|date2|record1A|record2B|...|recordnA|

the json structure i want is something like:

{
    "father": "father"
    "Report": {
        "Date": "2021-11-25",
        "Values": {
            "record1": "record1",
            "record2": "record2",
            "record3": "record3",
            "recordn": "recordn"
        }
    },
    "Report": {
        "Date": "2021-11-26",
        "Values": {
            "record1": "record1A",
            "record2": "record2A",
            "record3": "record3A",
            "recordn": "recordnA"
        }
    },
    "Report": {
        "Date": "2021-11-27",
        "Values": {
            "record1": "record1B",
            "record2": "record2B",
            "record3": "record3B",
            "recordn": "recordnB"
        }
    }
}
{
    "father": "father2"
    "Report": {
        "Date": "2021-11-25",
        "Values": {
            "record1": "record1",
            "record2": "record2",
            "record3": "record3",
            "recordn": "recordn"
        }
    },
    "Report": {
        "Date": "2021-11-26",
        "Values": {
            "record1": "record1A",
            "record2": "record2A",
            "record3": "record3A",
            "recordn": "recordnA"
        }
    }
}

Where:

I'm using a DataFlow Derived columns (please note the screenshot below to see the data flow process) to try to get somewhere. i was able to generate a json with this structure

{
    "STRUCTURE": {
        "FATHER": "FATHER"
        "Report": {
            "Date": "2021-11-03 00:00:00.0000000",
            "Values": {
                "record1": "record1A",
                "record2": "record2A",
                "record3": "record3A",
                "recordn": "recordnA"
            }
        }
    }
}

for each line of the csv. is not clear for me how to create a structure to have a sort of "group by" on my father column. i dont know if i need to change something in my import source step of adf or in the derived column. the thing is that obviusly derived clumns generates a column with the structure i write. so a json for everyline. i don know if there is a way to aggregate. enter image description here i'm open to totally change my prospective and do not use dataflows at all. i know how to use parameters but i don't know how to implement it there

please give me some suggestions

Upvotes: 0

Views: 167

Answers (1)

Aswin
Aswin

Reputation: 7156

To transform the pipe delimited text file as nested Json, follow the below approach.

  • Take the source transformation and take the source dataset in it.

  • Take the derived column transformation and add a new column named Values. The expression for the column Values is, @(each(match(startsWith(name,'record')), $$ = $$))

  • Take another derived column transformation and add a column named Report. The expression for that column is @(date=date, Values=Values)

  • Then take aggregate transformation and group the data by father column.

This way, you can aggregate into Nested JSON.

Upvotes: 1

Related Questions