Martin Michalski

Reputation: 11

Parsing JSON file for PDI

I am trying to process some irregularly structured JSON files using PDI (Pentaho Data Integration). After many attempts with the native steps, I concluded that I need to parse the JSON files before they are processed. Here is an example with just two records:

[
  {
    "UID": "34531513",
    "identities": [
      {
        "provider": "facebook",
        "providerUID": "123145517",
        "isLoginIdentity": true,
        "oldestDataUpdatedTimestamp": 145227161126
      },
      {
        "provider": "site",
        "providerUID": "321315415153",
        "isLoginIdentity": false,
        "oldestDataUpdated": "2015-07-14T13:37:43.682Z",
        "oldestDataUpdatedTimestamp": 1436881063682
      }
    ]
  },
  {
    "UID": "1234155",
    "identities": [
      {
        "provider": "facebook",
        "providerUID": "123145517",
        "isLoginIdentity": true,
        "oldestDataUpdatedTimestamp": 145227161126
      }
    ]
  }
]

The problem is that the entries inside identities do not carry the key field (UID). I would like one row per identity without losing its UID, so that the new key would be UID+provider (facebook, site or twitter).
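
To make the target shape concrete, here is a minimal Python sketch of the flattening described above, run outside PDI as a pre-processing step. The function name `flatten_identities`, the `key` column and the underscore separator are assumptions chosen for illustration; only `UID`, `identities` and `provider` come from the sample data.

```python
import json


def flatten_identities(records):
    """Return one row per identity, copying the parent UID onto each row."""
    rows = []
    for record in records:
        uid = record["UID"]
        for identity in record.get("identities", []):
            # Composite key UID + provider; the "_" separator is an assumption.
            row = {"UID": uid, "key": f"{uid}_{identity['provider']}"}
            # Keep whatever fields this identity happens to have.
            row.update(identity)
            rows.append(row)
    return rows


# Trimmed copy of the sample above (timestamps omitted for brevity).
sample = json.loads("""
[
  {"UID": "34531513",
   "identities": [
     {"provider": "facebook", "providerUID": "123145517", "isLoginIdentity": true},
     {"provider": "site", "providerUID": "321315415153", "isLoginIdentity": false}
   ]},
  {"UID": "1234155",
   "identities": [
     {"provider": "facebook", "providerUID": "123145517", "isLoginIdentity": true}
   ]}
]
""")

rows = flatten_identities(sample)
for row in rows:
    print(row["key"], row["providerUID"])
```

Because each row is built from whatever fields the identity has, fields that appear only on some identities (such as oldestDataUpdated) would simply be missing on the others and would need a default before loading into a fixed-column target.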

What would you recommend?

Thank you in advance,

Martin

Upvotes: 1

Views: 1280

Answers (1)

bolav

Reputation: 6998

To solve this in Pentaho, chain two JSON Input steps, with the second one reading its JSON from a field produced by the first.

[Screenshot: Chained JSON Inputs]

In the first JSON Input step, get the UID:

[Screenshot: First JSON step]

And then decode identities in the second step:

[Screenshot: Second JSON Step]
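
The logic of the two chained steps can be sketched in Python as an analogy. This is not PDI configuration and does not reproduce the step settings from the screenshots; the function names `first_step` and `second_step` are hypothetical, and the sketch assumes the first step passes identities downstream as JSON text alongside the UID.

```python
import json


def first_step(document):
    """Stage 1: one row per top-level object, with identities kept as JSON text."""
    for record in json.loads(document):
        yield {
            "UID": record["UID"],
            "identities": json.dumps(record["identities"]),
        }


def second_step(rows):
    """Stage 2: parse the identities field of each row, one output row per identity."""
    for row in rows:
        for identity in json.loads(row["identities"]):
            yield {
                "UID": row["UID"],  # carried over from stage 1
                "provider": identity["provider"],
                "providerUID": identity["providerUID"],
            }


# Trimmed copy of the sample from the question.
document = json.dumps([
    {"UID": "34531513", "identities": [
        {"provider": "facebook", "providerUID": "123145517"},
        {"provider": "site", "providerUID": "321315415153"},
    ]},
    {"UID": "1234155", "identities": [
        {"provider": "facebook", "providerUID": "123145517"},
    ]},
])

result = list(second_step(first_step(document)))
for row in result:
    print(row)
```

The point of the chain is that the UID already sits on the row when the second stage expands identities, so every output row keeps it.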

Upvotes: 2
