funnelCONN
funnelCONN

Reputation: 149

Complex JSON transformation in Groovy

This question kinda based on my old question.

Source JSON:

{
  "reports": [
    {
      "columnHeader": {
        "dimensions": [
          "ga:date"
        ],
        "metricHeader": {
          "metricHeaderEntries": [
            {
              "name": "ga:sessions",
              "type": "INTEGER"
            },
            {
              "name": "ga:bounces",
              "type": "INTEGER"
            }
          ]
        }
      },
      "data": {
        "rows": [
          {
            "dimensions": [
              "20210110"
            ],
            "metrics": [
              {
                "values": [
                  "49",
                  "22"
                ]
              }
            ]
          },
          {
            "dimensions": [
              "20210111"
            ],
            "metrics": [
              {
                "values": [
                  "40",
                  "17"
                ]
              }
            ]
          }
        ],
        "totals": [
          {
            "values": [
              "149",
              "64"
            ]
          }
        ],
        "rowCount": 2,
        "minimums": [
          {
            "values": [
              "40",
              "17"
            ]
          }
        ],
        "maximums": [
          {
            "values": [
              "60",
              "25"
            ]
          }
        ],
        "isDataGolden": true
      }
    }
  ]
}

I need to create JSON with metrics array from JSON above. I should loop through data.rows.dimensions and map data.rows.metrics.values to columnHeader.metricHeader.metricHeaderEntries.name (order is saved correctly). Or loop through metricHeaderEntries.name and add to them values and report_date. I don't know which way is right. So, I need an array metrics with fields name, value, report_date as I described above. Also, report_date from source JSON is in invalid format. Value like 20210110 should be converted to 2021-01-10

Expected output (or something like this, but grouped by report_date):

{
  "metrics": [
            {
              "name": "ga:sessions",
              "value": "49",
              "report_date": "2021-01-10"
            },
            {
              "name": "ga:sessions",
              "type": "40",
              "report_date": "2021-01-11"
            },
            {
              "name": "ga:bounces",
              "type": "22",
              "report_date": "2021-01-10"
            },
            {
              "name": "ga:bounces",
              "type": "17",
              "report_date": "2021-01-11"
            }
          ],
    "isDataGolden": true      
}

UPDATE

I tried with this script:


def data = new JsonSlurper().parseText(content)

def report = data.reports[0]
def rows = report.data.rows


def result = [ 
    metrics : rows.indexed().collect{index, value ->
            [ 
                name : report.columnHeader.metricHeader.metricHeaderEntries.name[index],
                value: value.metrics[0].values[index],
                report_date: value.dimensions[0]
                
            ]
        }, 
    isDataGolden : report.isDataGolden 
]

It returns:

{
   "metrics":[
      {
         "name":"ga:sessions",
         "value":"49",
         "report_date":"20210110"
      },
      {
         "name":"ga:bounces",
         "value":"25",
         "report_date":"20210111"
      }
   ],
   "isDataGolden": true
}

So it doesn't loop through all date+metrics combinations. I guess i need a nested loop? But what kind of?

Upvotes: 0

Views: 280

Answers (1)

cfrick
cfrick

Reputation: 37008

collect is always a 1:1 mapping between input and output. So you can not use to to create multiple outputs per input; you have to use collectMany, which does that.

So use collectMany to iterate all rows and then collect over the header/values tuples.

import groovy.json.JsonSlurper
def data = new JsonSlurper().parse("data.json" as File)
def report = data.reports[0]

def headers = report.columnHeader.metricHeader.metricHeaderEntries
println report.data.rows.collectMany{
    [headers, it.metrics[0].values].transpose().collect{ h, v ->
        [name: h.name, value: v, report_date: it.dimensions[0]]
    }
}
// [[name:ga:sessions, 
//   value:49, 
//   report_date:20210110], 
//  [name:ga:bounces, 
//   value:22, 
//   report_date:20210110], 
//  [name:ga:sessions, 
//   value:40, 
//   report_date:20210111], 
//  [name:ga:bounces, 
//   value:17, 
//   report_date:20210111]]

Upvotes: 4

Related Questions