Reputation: 21
I've been struggling with a JSON transformation in Python. I have JSON in the format below:
{
"Children": [{ "child": "Child 0"}],
"Parent": "Parent 10"
},
{
"Children": [{ "child": "Child 1"}],
"Parent": "Parent 10"
},
{
"Children": [{ "child": "Child 2"}],
"Parent": "Parent 11"
},
Instead of having duplicated parents, I would like to merge the children together to get this:
{
"Children": [{ "child": "Child 0"}, { "child": "Child 1"}],
"Parent": "Parent 10"
},
{
"Children": [{ "child": "Child 2"}],
"Parent": "Parent 11"
},
Upvotes: 2
Views: 3772
Reputation: 26315
You can also use a collections.defaultdict() to do this and serialize the result at the end:
from collections import defaultdict
from json import dumps
data = [
{"Children": [{"child": "Child 0"}], "Parent": "Parent 10"},
{"Children": [{"child": "Child 1"}], "Parent": "Parent 10"},
{"Children": [{"child": "Child 2"}], "Parent": "Parent 11"},
]
# Collect every child under its parent key
d = defaultdict(list)
for dic in data:
    parent, children = dic["Parent"], dic["Children"]
    d[parent].extend(children)
# Rebuild the list-of-objects shape, one entry per parent
result = []
for k, v in d.items():
    result.append({"Parent": k, "Children": v})
print(dumps(result))
This gives a JSON array of JSON objects:
[{"Parent": "Parent 10", "Children": [{"child": "Child 0"}, {"child": "Child 1"}]}, {"Parent": "Parent 11", "Children": [{"child": "Child 2"}]}]
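If the data arrives as JSON text rather than a Python list, the same idea can be wrapped in a small function. This is only a sketch: the name merge_children and the raw string are made up for illustration, and it assumes the input is a JSON array whose objects all have "Parent" and "Children" keys.

```python
import json
from collections import defaultdict


def merge_children(records):
    """Merge the "Children" lists of records that share a "Parent"."""
    grouped = defaultdict(list)
    for record in records:
        # extend() copies the child entries into a new per-parent list,
        # so the input lists themselves are left unchanged.
        grouped[record["Parent"]].extend(record["Children"])
    # Dicts keep insertion order (Python 3.7+), so parents come out
    # in the order they were first seen.
    return [
        {"Children": kids, "Parent": parent}
        for parent, kids in grouped.items()
    ]


# Hypothetical input: the question's objects wrapped in a JSON array.
raw = """[
  {"Children": [{"child": "Child 0"}], "Parent": "Parent 10"},
  {"Children": [{"child": "Child 1"}], "Parent": "Parent 10"},
  {"Children": [{"child": "Child 2"}], "Parent": "Parent 11"}
]"""

merged = merge_children(json.loads(raw))
print(json.dumps(merged, indent=2))
```

The returned list has the same key order as the output requested in the question, and json.dumps turns it back into JSON text.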
You can also group the data by parent key using a nested defaultdict():
d = defaultdict(lambda: defaultdict(list))
for dic in data:
    parent, children = dic["Parent"], dic["Children"]
    d[parent]["Children"].extend(children)
print(dumps(d))
This gives the following structure:
{"Parent 10": {"Children": [{"child": "Child 0"}, {"child": "Child 1"}]}, "Parent 11": {"Children": [{"child": "Child 2"}]}}
Because the result is keyed by parent, looking up a parent's children is an O(1) dictionary access on average, rather than a scan of the list.
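As a rough illustration of that lookup (the variable names by_parent, children and missing are just for this sketch), note that indexing a defaultdict with an unknown key silently creates an empty entry, so .get() is the safer choice for read-only checks:

```python
from collections import defaultdict

data = [
    {"Children": [{"child": "Child 0"}], "Parent": "Parent 10"},
    {"Children": [{"child": "Child 1"}], "Parent": "Parent 10"},
    {"Children": [{"child": "Child 2"}], "Parent": "Parent 11"},
]

by_parent = defaultdict(lambda: defaultdict(list))
for dic in data:
    by_parent[dic["Parent"]]["Children"].extend(dic["Children"])

# Direct lookup by parent name, no loop over the records needed.
children = by_parent["Parent 10"]["Children"]
print(children)  # [{'child': 'Child 0'}, {'child': 'Child 1'}]

# .get() does not trigger the default factory, so an unknown parent
# returns None instead of being added to the mapping.
missing = by_parent.get("Parent 99")
print(missing)  # None
```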
Upvotes: 2
Reputation: 1411
Take a look at the itertools.groupby function. Here's an example with your data grouped by Parent.
>>> from itertools import groupby
>>> import pprint
>>> data = [{
...     "Children": [{ "child": "Child 0"}],
...     "Parent": "Parent 10"
... },
... {
...     "Children": [{ "child": "Child 1"}],
...     "Parent": "Parent 10"
... },
... {
...     "Children": [{ "child": "Child 2"}],
...     "Parent": "Parent 11"
... }]
>>> data_grouped = {k: list(v) for k, v in groupby(data, key=lambda x: x["Parent"])}
>>> pp = pprint.PrettyPrinter(indent=4)
>>> pp.pprint(data_grouped)
{ 'Parent 10': [ { 'Children': [{'child': 'Child 0'}],
'Parent': 'Parent 10'},
{ 'Children': [{'child': 'Child 1'}],
'Parent': 'Parent 10'}],
'Parent 11': [{'Children': [{'child': 'Child 2'}], 'Parent': 'Parent 11'}]}
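Here I've placed your example dicts inside a list and grouped them by the Parent entry in each dict. This is all wrapped up inside a dict comprehension to give a readable output. Note that groupby only groups consecutive items, so the list must already be ordered (or sorted) by Parent; otherwise a later run with the same key overwrites the earlier one in the dict. Each group also still holds the original dicts, so the Children lists are not merged yet.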
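To get the merged shape asked for in the question with groupby, one possible sketch is to sort by Parent first and then concatenate the Children lists of each group. The deliberately shuffled input and the names by_parent and merged are assumptions made for this example:

```python
from itertools import chain, groupby
from operator import itemgetter

# Same records as above, but out of order on purpose.
data = [
    {"Children": [{"child": "Child 0"}], "Parent": "Parent 10"},
    {"Children": [{"child": "Child 2"}], "Parent": "Parent 11"},
    {"Children": [{"child": "Child 1"}], "Parent": "Parent 10"},
]

by_parent = itemgetter("Parent")

merged = [
    {
        # Flatten the Children lists of every record in this group.
        "Children": list(chain.from_iterable(d["Children"] for d in group)),
        "Parent": parent,
    }
    # sorted() is stable, so children keep their original relative order
    # while records with the same parent become adjacent.
    for parent, group in groupby(sorted(data, key=by_parent), key=by_parent)
]

print(merged)
```

Sorting makes this O(n log n), so for large inputs a dictionary-based grouping may be the cheaper option.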
Upvotes: 2