Reputation: 3299
The objective is to build a single DataFrame
from a nested list of dicts such as the one below:
n=dict(de={'name':'a','status':'aa'},th={'name':'b','status':'bb'},al={'name':'c','status':'cc'})
NESTED_DICT=[dict(CH=dict(bm=[n,n], cm=[n,n], dm=[n,n]),PL=dict(bm=[n,n], cm=[n,n], dm=[n,n])),dict()]
data=[NESTED_DICT for _ in range(3)]
This can be achieved easily with a nested for loop,
as below:
import pandas as pd

all_data = []
for xdata in data:
    for con_type in ['CH', 'PL']:
        for condi in ['bm', 'cm', 'dm']:
            # Each leaf is a list of dicts keyed by 'de', 'th', 'al'
            ndata = xdata[0][con_type][condi]
            df = pd.concat([pd.DataFrame.from_dict(x, orient='index') for x in ndata])
            all_data.append(df)
df = pd.concat(all_data)
which produces
   name status
de    a     aa
th    b     bb
al    c     cc
de    a     aa
th    b     bb
..  ...    ...
th    b     bb
al    c     cc
de    a     aa
th    b     bb
al    c     cc

[108 rows x 2 columns]
I'm looking for a more compact and efficient way of doing it.
I have come across json_normalize for nested data,
which is super compact.
Based on the examples, I have the impression this can be achieved with something like
# For single subject
data_nested=data[0][0]
df=pd.json_normalize(data_nested,meta=['CH','PL'])
The output is something like
CH.bm ... PL.dm
0 [{'de': {'name': 'a', 'status': 'aa'}, 'th': {... ... [{'de': {'name': 'a', 'status': 'aa'}, 'th': {...
Which is expected.
What parameter should be modified to get something like the nested for loop above?
Upvotes: 1
Views: 138
Reputation: 93191
No. json_normalize works better if your top level is a dict -- it's an array in this case. And the deeply nested data structure also makes it very challenging for json_normalize.
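To illustrate the difficulty, here is a minimal sketch, assuming pandas 1.0 or later (where pd.json_normalize is a top-level function). It points record_path at a single branch of the question's structure; the names data_nested and flat are only illustrative. The inner 'de'/'th'/'al' dicts are flattened into dotted columns rather than becoming rows, so the result still does not match the loop's name/status layout, and it covers only one of the six branches.

```python
import pandas as pd

# Same leaf structure as in the question
n = dict(de={'name': 'a', 'status': 'aa'},
         th={'name': 'b', 'status': 'bb'},
         al={'name': 'c', 'status': 'cc'})
data_nested = dict(CH=dict(bm=[n, n], cm=[n, n], dm=[n, n]),
                   PL=dict(bm=[n, n], cm=[n, n], dm=[n, n]))

# record_path walks CH -> bm and treats each element of that list as one record
flat = pd.json_normalize(data_nested, record_path=['CH', 'bm'])

# One row per list element; the nested dicts turn into columns such as 'de.name'
print(flat.shape)
print(sorted(flat.columns))
```

Getting from this wide layout back to one row per 'de'/'th'/'al' entry would need further reshaping for every branch, which is why the explicit iteration stays simpler here.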
You can make the loop more readable with a list comprehension:
all_data = [
    pd.DataFrame.from_dict(x, orient='index')
    for xdata in data
    for con_type in ['CH', 'PL']
    for condi in ['bm', 'cm', 'dm']
    for x in xdata[0][con_type][condi]
]
df = pd.concat(all_data)
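As a quick sanity check, the following self-contained sketch rebuilds the sample data from the question and compares the list-comprehension result against the original nested loop. The names df_loop and df_comp are only illustrative.

```python
import pandas as pd

# Sample data from the question
n = dict(de={'name': 'a', 'status': 'aa'},
         th={'name': 'b', 'status': 'bb'},
         al={'name': 'c', 'status': 'cc'})
NESTED_DICT = [dict(CH=dict(bm=[n, n], cm=[n, n], dm=[n, n]),
                    PL=dict(bm=[n, n], cm=[n, n], dm=[n, n])), dict()]
data = [NESTED_DICT for _ in range(3)]

# Original approach: nested loops, one intermediate frame per branch
frames = []
for xdata in data:
    for con_type in ['CH', 'PL']:
        for condi in ['bm', 'cm', 'dm']:
            ndata = xdata[0][con_type][condi]
            frames.append(pd.concat(
                [pd.DataFrame.from_dict(x, orient='index') for x in ndata]))
df_loop = pd.concat(frames)

# List-comprehension approach: one small frame per leaf dict, concatenated once
df_comp = pd.concat([
    pd.DataFrame.from_dict(x, orient='index')
    for xdata in data
    for con_type in ['CH', 'PL']
    for condi in ['bm', 'cm', 'dm']
    for x in xdata[0][con_type][condi]
])

print(df_comp.shape)            # (108, 2), matching the output in the question
print(df_comp.equals(df_loop))  # True
```

Both versions visit the leaves in the same order, so the rows and the repeated 'de'/'th'/'al' index line up exactly.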
Upvotes: 1