Reputation: 11067
I have a nested dictionary with 3 levels,
example_d = {'attribute_001': {'colour': {'blue': 5, 'green': 5, 'red': 5},
'country': {'France': 3, 'Germany': 3, 'India': 3, 'UK': 3, 'USA': 3}},
'attribute_002': {'colour': {'blue': 5, 'green': 5, 'red': 5},
'country': {'France': 3, 'Germany': 3, 'India': 3, 'UK': 3, 'USA': 3}},
'attribute_003': {'colour': {'blue': 5, 'green': 5, 'red': 5},
'country': {'France': 3, 'Germany': 3, 'India': 3, 'UK': 3, 'USA': 3}},
'attribute_004': {'colour': {'blue': 5, 'green': 5, 'red': 5},
'country': {'France': 3, 'Germany': 3, 'India': 3, 'UK': 3, 'USA': 3}},
'attribute_005': {'colour': {'blue': 5, 'green': 5, 'red': 5},
'country': {'France': 3, 'Germany': 3, 'India': 3, 'UK': 3, 'USA': 3}}}
and I want to move this into a pandas dataframe such that the row-index is sourced at the first level of my dictionary, and use the remaining levels as hierarchical column indices.
I can get close by using the following, adapting an answer from here:
pd.concat({key:pd.DataFrame.from_dict(example_d[key],orient='columns')
for key in example_d.keys()}).unstack(1)
which gives me:
But I need the lowest level of the multi-level column indices to respect their parents.
i.e. under the colour
heading, I want only colour-columns to appear, and under the country
heading, I want to see only country-columns.
Upvotes: 2
Views: 596
Reputation: 862521
First change dictionary, pass to Series
constructor and reshape by Series.unstack
:
reform = {(level1_key, level2_key, level3_key): values
for level1_key, level2_dict in example_d.items()
for level2_key, level3_dict in level2_dict.items()
for level3_key, values in level3_dict.items()}
df = pd.Series(reform).unstack(level=[1,2])
print (df)
colour country
blue green red France Germany India UK USA
attribute_001 5 5 5 3 3 3 3 3
attribute_002 5 5 5 3 3 3 3 3
attribute_003 5 5 5 3 3 3 3 3
attribute_004 5 5 5 3 3 3 3 3
attribute_005 5 5 5 3 3 3 3 3
Upvotes: 2
Reputation: 323226
IIUC using concat
df= pd.DataFrame(example_d).T
pd.concat([df[x].apply(pd.Series) for x in list(df)],1,keys=list(df))
Out[540]:
colour country
blue green red France Germany India UK USA
attribute_001 5 5 5 3 3 3 3 3
attribute_002 5 5 5 3 3 3 3 3
attribute_003 5 5 5 3 3 3 3 3
attribute_004 5 5 5 3 3 3 3 3
attribute_005 5 5 5 3 3 3 3 3
Upvotes: 1