Pandas - Convert two columns into a new column as a dictionary

Question

I'm trying to use Pandas to combine two columns into a new column whose values are dictionaries built from those two columns.

import pandas as pd

df = pd.DataFrame({'Metrics' : [[("P", "P"), ("Q","Q")], ("K", "K"), ("Z", "Z")],
                   'Stage_Name' : ["P", "K", "Z"],
                   'Block_Name' : ["A", "B", "A"]})

Essentially, I want to merge Metrics and Stage_Name into another column called merged. For example, the first row would be:

{'P': [('P', 'P'), ('Q', 'Q')]}

I know how to convert one row into a dictionary representation, but I'm not sure how to do this for all rows without a for loop:

something = df.iloc[[0]].set_index('Stage_Name')['Metrics'].to_dict()
print(something)
Output: {'P': [('P', 'P'), ('Q', 'Q')]}

Later I want to aggregate by Block_Name, so that for the merged column the dictionaries in each group are combined into one. For Block_Name A, the result would be:

{'P': [('P', 'P'), ('Q', 'Q')], 'Z' : [('Z', 'Z')] }

For Stage_Name and Metrics, I'll just collect the values of each group into a tuple, which looks like this:

grouped = df.groupby(df['Block_Name'])
df_2 = grouped.aggregate(lambda x: tuple(x))

Can someone point me in the right direction? Thanks!

EdChum · Accepted Answer

IIUC, you can use apply with a lambda:

In [19]:
df['merged'] = df.apply(lambda row: {row['Stage_Name']:row['Metrics']}, axis=1)
df

Out[19]:
  Block_Name           Metrics Stage_Name                           merged
0          A  [(P, P), (Q, Q)]          P  {'P': [('P', 'P'), ('Q', 'Q')]}
1          B            (K, K)          K                {'K': ('K', 'K')}
2          A            (Z, Z)          Z                {'Z': ('Z', 'Z')}
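
The answer above does not cover the follow-up aggregation by Block_Name. One possible way to do it, offered as a sketch rather than the answerer's method, is to iterate over the groups and build one dictionary per block with zip. The name merged_by_block is hypothetical. Note that 'Z' maps to the bare tuple ('Z', 'Z') here, not the list [('Z', 'Z')] shown in the question, because that is how the value is stored in the sample frame.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({'Metrics': [[("P", "P"), ("Q", "Q")], ("K", "K"), ("Z", "Z")],
                   'Stage_Name': ["P", "K", "Z"],
                   'Block_Name': ["A", "B", "A"]})

# For each block, pair every Stage_Name with its Metrics value.
# merged_by_block is a hypothetical name; the result is a plain dict of dicts.
merged_by_block = {
    block: dict(zip(group['Stage_Name'], group['Metrics']))
    for block, group in df.groupby('Block_Name')
}

print(merged_by_block['A'])
print(merged_by_block['B'])
```

If two rows in the same block share a Stage_Name, the later row silently overwrites the earlier one, so this only fits data where stage names are unique within a block.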
