Converting pandas dataframe to dictionary with same keys over multiple rows

Question

I'm trying to create a dictionary from a pandas DataFrame, using one column as the key and the remaining columns as the values. The problem is that the same key appears in multiple rows, and I've read through many similar SO posts without finding an answer. This is what I have:

df1:

pid  feature_id  feature_value
78          20            1.0
78        1130            3.0
...
91        1148            1.0
92        1153            4.0
92        1154            1.0
...
115       1162            1.0
115       1175            5.0
......

This is what I tried:

df2 = df1.set_index('pid').agg(tuple, 1).to_dict()

But the problem is that this does not seem to handle the same key appearing in multiple rows.

What I want is something like this:

{78: [(20, 1.0), (1130, 3.0)]..., 115: [(1162, 1.0), (1175, 5.0)], ...}

Please advise.

Parvesh Kumar · Accepted Answer

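def df_to_dict(df):
    # map each pid to a list of (feature_id, feature_value) tuples
    d = {}
    # itertuples keeps each column's dtype; iterrows would upcast
    # the integer columns to float because feature_value is a float
    for pid, feature_id, feature_value in df.itertuples(index=False, name=None):
        # setdefault creates the empty list the first time a pid is seen
        d.setdefault(pid, []).append((feature_id, feature_value))
    return d

# assumes df1 has exactly the three columns shown in the question, in that order
print(df_to_dict(df1))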
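
The attempt in the question loses rows because to_dict() on a Series with a repeated index keeps only the last value for each key. If you would rather avoid an explicit row loop, a groupby-based variant is one option. This is only a minimal sketch: the small df1 below is rebuilt from a few of the rows shown in the question, and the name result is just for illustration.

```python
import pandas as pd

# Sample rows copied from the question; the real df1 is assumed
# to have the same three columns.
df1 = pd.DataFrame({
    "pid": [78, 78, 91, 92, 92],
    "feature_id": [20, 1130, 1148, 1153, 1154],
    "feature_value": [1.0, 3.0, 1.0, 4.0, 1.0],
})

# Group the rows by pid, then pair up the two remaining columns
# within each group to get a list of (feature_id, feature_value) tuples.
result = {
    pid: list(zip(group["feature_id"], group["feature_value"]))
    for pid, group in df1.groupby("pid")
}
```

Note that groupby sorts the keys by default; pass sort=False if you want the pids in order of first appearance.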
