Kspr

Reputation: 685

Parsing a Pandas DataFrame to nested dictionaries

I have a DataFrame in the following format:

 Path    |  Val
 A/B        1
 A/C        3
 A/D/E      5
 A/E        7
 F/G        9
 ...        ...

I want to transform this into a nested dictionary in which the levels of keys are the components of the Path column, split on "/".

The expected output would be

 d = {'A': {'B': 1,
            'C': 3,
            'D': {'E': 5},
            'E': 7},
      'F': {'G': 9}}

What would be an efficient way of doing this?

Upvotes: 0

Views: 59

Answers (1)

Prathamesh Sawant

Reputation: 137

Separating the Path column

df[['main_path','Subpath1','Subpath2']]=df["Path"].str.split("/",expand=True)
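To make the step above reproducible, here is a minimal sketch that builds the sample data from the question and applies the same split. The DataFrame construction is an assumption based on the table in the question. Note that assigning to exactly three column names only works when the deepest path has exactly three levels, as it does here.

```python
import pandas as pd

# Sample data copied from the table in the question (an assumption about
# how the real DataFrame is built).
df = pd.DataFrame({
    "Path": ["A/B", "A/C", "A/D/E", "A/E", "F/G"],
    "Val": [1, 3, 5, 7, 9],
})

# expand=True returns one column per path level; paths with fewer levels
# are padded with missing values in the trailing columns.
parts = df["Path"].str.split("/", expand=True)
df[["main_path", "Subpath1", "Subpath2"]] = parts

print(df)
```

Only the "A/D/E" row ends up with a non-missing Subpath2.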

Creating a new DataFrame grouped by Subpath1

df1 = df.groupby(['Subpath1']).agg({'Val': list, 'main_path': 'first'}).reset_index()

Creating another DataFrame grouped by Subpath2

df2 = df.groupby(['Subpath2']).agg({'Val': list, 'Subpath1': 'first'})

Creating a dict for each Subpath1

df2 = df2.groupby(['Subpath1']).agg(dict).reset_index()

Merging the dicts into df1

df3 = pd.merge(df1,df2,on=['Subpath1'], how='left')

Replacing NaN values in Val_y with the values from Val_x

df3['Val'] = df3['Val_y'].fillna(df3['Val_x'])

Dropping the temporary columns

df3 = df3.drop(['Val_x', 'Val_y'], axis=1)

# Setting subpath1 as index
df3 = df3.set_index('Subpath1')

Final dict

df4 = df3.groupby(['main_path']).agg(dict)
df4.to_json(orient='columns')

Output (a JSON string; each leaf is a one-element list because Val was aggregated with list)

{"Val":{"A":{"B":[1],"C":[3],"D":{"E":[5]},"E":[7]},"F":{"G":[9]}}}

Upvotes: 1
