Reputation: 11
I am trying to process some unevenly structured JSON files using PDI (Pentaho) and, after a lot of experimenting with the native tools, I figured out that I need to parse the JSON files before they are processed. This is an example with just two rows:
[{
"UID": "34531513",
"identities":
[{
"provider": "facebook",
"providerUID": "123145517",
"isLoginIdentity": true,
"oldestDataUpdatedTimestamp": 145227161126
},
{
"provider": "site",
"providerUID": "321315415153",
"isLoginIdentity": false,
"oldestDataUpdated": "2015-07-14T13:37:43.682Z",
"oldestDataUpdatedTimestamp": 1436881063682
}]
},
{
"UID": "1234155",
"identities":
[{
"provider": "facebook",
"providerUID": "123145517",
"isLoginIdentity": true,
"oldestDataUpdatedTimestamp": 145227161126
}]
}]
The problem is that the entries inside identities do not carry the key field (UID), but I would like to get a separate row for each identity without losing its UID. That way the new key would be UID + provider (facebook, site or twitter).
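For the sample above, that means the desired output would be one row per identity, roughly:

UID       provider  providerUID
34531513  facebook  123145517
34531513  site      321315415153
1234155   facebook  123145517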
What would you recommend?
Thank you in advance,
Martin
Upvotes: 1
Views: 1280
Reputation: 6998
To solve this in Pentaho you have to chain two JSON Input steps.
In your first input, get the UID:
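A minimal sketch of the field configuration for that first step (field names and exact JSONPath syntax are illustrative and may need adjusting for your PDI version); the key point is to also read the whole identities array back out as a raw JSON string so the next step can parse it:

First JSON Input step (source: the file)
  Field        Path                Type
  UID          $.[*].UID           String
  identities   $.[*].identities    String   (the whole array, passed on as JSON text)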
And then decode the identities array in the second step:
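A corresponding sketch for the second step, assuming the first step produced a field named identities. In the second JSON Input step, tick the option to read the source from a field (called something like "Source is defined in a field?", depending on your PDI version) and point it at identities. The UID from the first step is carried along on every output row, which gives you the UID + provider key you wanted:

Second JSON Input step (source: the identities field)
  Field                       Path                              Type
  provider                    $.[*].provider                    String
  providerUID                 $.[*].providerUID                 String
  isLoginIdentity             $.[*].isLoginIdentity             Boolean
  oldestDataUpdatedTimestamp  $.[*].oldestDataUpdatedTimestamp  Integer

If you also extract oldestDataUpdated, which only some identities have, you will probably need to tell the step to tolerate missing paths (the Ignore missing path option, if your PDI version offers it).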
Upvotes: 2