Convert pandas dataframe to dictionary with nested dictionary based on 2 key columns and 1 value column

Question

I have a dataframe in pandas as follows:

df = pd.DataFrame({'key1': ['abcd', 'defg', 'hijk', 'abcd'],
                   'key2': ['zxy', 'uvq', 'pqr', 'lkj'],
                   'value': [1, 2, 4, 5]})

I am trying to create a dictionary keyed by key1, where each value is a nested dictionary mapping key2 to value. I have tried the following:

dct = df.groupby('key1')[['key2', 'value']].apply(lambda x: x.set_index('key2').to_dict(orient='index')).to_dict()

dct

{'abcd': {'zxy': {'value': 1}, 'lkj': {'value': 5}},
 'defg': {'uvq': {'value': 2}},
 'hijk': {'pqr': {'value': 4}}}

Desired output:

{'abcd': {'zxy': 1, 'lkj': 5}, 'defg': {'uvq': 2}, 'hijk': {'pqr': 4}}

jpp · Accepted Answer

Using collections.defaultdict, you can construct a defaultdict of dict objects and add elements while iterating over the rows of your dataframe:

from collections import defaultdict

d = defaultdict(dict)

for row in df.itertuples(index=False):
    d[row.key1][row.key2] = row.value

print(d)

defaultdict(<class 'dict'>, {'abcd': {'zxy': 1, 'lkj': 5}, 'defg': {'uvq': 2}, 'hijk': {'pqr': 4}})

Since defaultdict is a subclass of dict, the result can be used anywhere a regular dict is expected, with no further conversion.
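
To check the subclass claim against the desired output from the question, here is a minimal sketch. The names expected, is_dict, matches and plain are introduced here for illustration only and are not part of the original answer. It also shows one caveat worth knowing: looking up a missing key on a defaultdict inserts it, so if that matters you may prefer to convert to a plain dict at the end.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({'key1': ['abcd', 'defg', 'hijk', 'abcd'],
                   'key2': ['zxy', 'uvq', 'pqr', 'lkj'],
                   'value': [1, 2, 4, 5]})

# Same construction as in the answer above.
d = defaultdict(dict)
for row in df.itertuples(index=False):
    d[row.key1][row.key2] = row.value

# The desired output, copied from the question.
expected = {'abcd': {'zxy': 1, 'lkj': 5}, 'defg': {'uvq': 2}, 'hijk': {'pqr': 4}}

# defaultdict is a dict subclass and compares equal to an ordinary dict
# with the same contents.
is_dict = isinstance(d, dict)
matches = (d == expected)

# Optional: convert to a plain dict so that a lookup of a missing key
# raises KeyError instead of silently adding an empty nested dict.
plain = dict(d)
```

Whether the conversion is worth it depends on how the dictionary is used afterwards; for read-only access to known keys, the defaultdict behaves the same as the plain dict.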
