Reputation: 685
I have a DataFrame in the following format:
Path  | Val
A/B   | 1
A/C   | 3
A/D/E | 5
A/E   | 7
F/G   | 9
...   | ...
I want to transform this into a nested dictionary in which the levels of keys come from the Path value split on "/".
The expected output would be:
d = {
    'A': {
        'B': 1,
        'C': 3,
        'D': {'E': 5},
        'E': 7,
    },
    'F': {'G': 9},
}
What would be an efficient way of doing this?
Upvotes: 0
Views: 59
Reputation: 137
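import pandas as pd

# Assumes df already holds the Path and Val columns and that the deepest path has three levels.
# Split Path into one column per level.
df[['main_path','Subpath1','Subpath2']] = df["Path"].str.split("/", expand=True)
# Collect the values for each second-level key.
df1 = df.groupby(['Subpath1']).agg({'Val': list, 'main_path': 'first'}).reset_index()
# Collect the values for each third-level key, then wrap them in a dict per second-level key.
df2 = df.groupby(['Subpath2']).agg({'Val': list, 'Subpath1': 'first'})
df2 = df2.groupby(['Subpath1']).agg(dict).reset_index()
# Use the nested dict where one exists, otherwise keep the plain list.
df3 = pd.merge(df1, df2, on=['Subpath1'], how='left')
df3['Val'] = df3['Val_y'].fillna(df3['Val_x'])
df3 = df3.drop(['Val_x', 'Val_y'], axis=1)
# Set Subpath1 as the index so it becomes the dict key in the next step.
df3 = df3.set_index('Subpath1')
# Group by the top-level key and serialise the result to JSON.
df4 = df3.groupby(['main_path']).agg(dict)
df4.to_json(orient='columns')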
{"Val":{"A":{"B":[1],"C":[3],"D":{"E":[5]},"E":[7]},"F":{"G":[9]}}}
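Note that the column-splitting approach above is tied to a fixed number of path levels and wraps each value in a list. As a possible alternative, here is a minimal sketch that walks each path and builds the dictionary directly, so depth does not matter. The helper name nest_paths is made up for illustration, and the sketch assumes no path is both a leaf and a prefix of another path (e.g. "A" alongside "A/B"), since that would try to descend into a non-dict value.

```python
def nest_paths(pairs, sep="/"):
    """Build a nested dict from (path, value) pairs.

    nest_paths is a hypothetical helper name, not part of pandas.
    """
    root = {}
    for path, value in pairs:
        # Everything before the last separator names intermediate dicts.
        *parents, leaf = path.split(sep)
        node = root
        for key in parents:
            # Create the intermediate dict on first sight, then descend into it.
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


# Sample rows taken from the question.
rows = [("A/B", 1), ("A/C", 3), ("A/D/E", 5), ("A/E", 7), ("F/G", 9)]
d = nest_paths(rows)
print(d)
# {'A': {'B': 1, 'C': 3, 'D': {'E': 5}, 'E': 7}, 'F': {'G': 9}}
```

With the DataFrame from the question, the same function should work as nest_paths(zip(df["Path"], df["Val"])), which iterates over the two columns in parallel without any groupby or merge.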
Upvotes: 1