Reputation: 7951
I have this input in JSON, I am having difficulty grouping things together:
[
{
"PK": "123",
"SURNAME": "CHEN",
"SEX": "F",
"DATE_OF_BIRTH": "1962-08-29 00:00:00.0",
"PHONE_TYPE": "05",
"PHONE_NO": "12312312",
"OPERATION": "INSERT",
}, {
"PK": "123",
"SURNAME": "CHEN",
"SEX": "F",
"DATE_OF_BIRTH": "1962-08-29 00:00:00.0",
"PHONE_TYPE": "04",
"PHONE_NO": "78787878",
"OPERATION": "UPDATE"
},{
"PK": "456",
"SURNAME": "DEV",
"SEX": "M",
"DATE_OF_BIRTH": "1953-06-06 00:00:00.0",
"PHONE_TYPE": "05",
"PHONE_NO": "34343434",
"OPERATION": "INSERT"
}, {
"CLIENT_ID": "456",
"SURNAME": "DEV",
"SEX": "M",
"DATE_OF_BIRTH": "1953-06-06 00:00:00.0",
"PHONE_TYPE": "02",
"PHONE_NO": "56565656",
"OPERATION": "DELETE",
}
]
And this is the expected output:
{
"Customers": [{
"MatchingProfile": {
"CustomerNumber": "", // leave blank
"DBType": "Oracle",
"DBKey": "123",
"LastName": "CHEN",
"Gender": "Female",
"Birthdate": "1962-08-29",
},
"Contacts": [{
"ContactType": "Fax",
"CountryCode": "", // leave blank
"Phone_Number": "12312312",
"Status": "Active"
}, {
"ContactType": "Mobile",
"CountryCode": "", // leave blank
"PhoneNumber": "78787878",
"Status": "Active"
}
]
},{
"MatchingProfile": {
"CustomerNumber": "", // leave blank
"DBType": "Oracle",
"DBKey": "456",
"LastName": "DEV",
"Gender": "Male",
"Birthdate": "1953-06-06",
},
"Contacts": [{
"ContactType": "Fax",
"CountryCode": "", // leave blank
"PhoneNumber": "34343434",
"Status": "Active"
}, {
"ContactType": "Office",
"CountryCode": "", // leave blank
"PhoneNumber": "56565656",
"Status": "Inactive"
}
]
}
]
}
The SEX from input is "M", "F", plus some other coded values. Corresponding values for the output Gender is "Male", "Female" and left "" (blank) otherwise. (Don't accuse me of being gender-biased, I know, this is a project requirement, okay? Not my call)
The OPERATION from input that is "INSERT" and "UPDATE" will be a corresponding Status: "Active" ; for "DELETE" it will be Status : "Inactive".
Plus the Birthdate output is truncated equivalent of DATE_OF_BIRTH, minus the time.
The PHONE_TYPE are the following: 02 - "Office", 04 - "Mobile", 05 - "Fax" (I purposedly left out the others).
Is it possible to have a mapping for this in Jolt? Can you show a spec? I'm new with Jolt and I am bit confused. This is 10x harder than Excel Pivot.
Upvotes: 0
Views: 1908
Reputation: 4586
This is pretty much as close as OOTB Jolt can get. Note Jolt is for changing the structure of your data, not doing custom data mappings of things like "PHONE_TYPE": "04" means "Fax".
Transformed Output
{
"Customers" : [ {
"MatchingProfile" : {
"DBKey" : "123",
"Gender" : "F",
"LastName" : "CHEN",
"Birthdate" : "1962-08-29 00:00:00.0",
"Contacts" : [ {
"ContactType" : "05",
"Phone_Number" : "12312312",
"Status" : "INSERT"
}, {
"ContactType" : "04",
"Phone_Number" : "78787878",
"Status" : "UPDATE"
} ]
}
}, {
"MatchingProfile" : {
"DBKey" : "456",
"Gender" : "M",
"LastName" : "DEV",
"Birthdate" : "1953-06-06 00:00:00.0",
"Contacts" : [ {
"ContactType" : "05",
"Phone_Number" : "34343434",
"Status" : "INSERT"
}, {
"ContactType" : "02",
"Phone_Number" : "56565656",
"Status" : "DELETE"
} ]
}
} ]
}
Jolt Spec
[
// first pivot by the value of SURNAME
{
"operation": "shift",
"spec": {
"*": { // for each item in the array
"SURNAME": { // match SURNAME
"*": { // match any value of SURNAME
"@2": "&[]" // copy the whole record from 2 levels up to the SURNAME as an array, so we know that in the next step it is always an array
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": { // match CHEN or DEV
"0": {
// only pull pk, sex, dob from the first entry of the SURNAME array so as to not duplicate output
"PK": "Customers[#3].MatchingProfile.DBKey",
"SEX": "Customers[#3].MatchingProfile.Gender",
"SURNAME": "Customers[#3].MatchingProfile.LastName",
"DATE_OF_BIRTH": "Customers[#3].MatchingProfile.Birthdate",
// this does mean that the PHONE_TYPE has to be dealt with twice
// once for the zeroth item, and then once again for the rest
"PHONE_TYPE": "Customers[#3].MatchingProfile.Contacts[0].ContactType",
"PHONE_NO": "Customers[#3].MatchingProfile.Contacts[0].Phone_Number",
"OPERATION": "Customers[#3].MatchingProfile.Contacts[0].Status"
},
"*": {
// handle PHONE_TYPE and friends for the other records
"PHONE_TYPE": "Customers[#3].MatchingProfile.Contacts[&1].ContactType",
"PHONE_NO": "Customers[#3].MatchingProfile.Contacts[&1].Phone_Number",
"OPERATION": "Customers[#3].MatchingProfile.Contacts[&1].Status"
}
}
}
}
]
If if you find Jolt valuable for the pivot and the structure change, then your best bet is to "fixup" your input data array, aka map "PHONE_TYPE": "04" to "Fax", trim the 00:00:00 from the birthday, and then use Jolt to make the nested "Customers[].MatchingProfile.Contacts[]" structure.
Upvotes: 2