MarkD
MarkD

Reputation: 4944

Combining Pandas Dataframe with indexed value as column name

Given the following two pandas DataFrames:

main_table = pd.DataFrame([[1, 'A'],
                           [2, 'B'],
                           [3,'C']],
                           columns=['id', 'label'])

extras_table = pd.DataFrame([[1, 'e1', 'e1_Val'], 
                             [1, 'e2', 'e2_Val'],
                             [2, 'e2', 'e2_Val2'],
                             [3, 'e1', 'e1_val3']], 
                             columns=['main_id', 'col_label', 'value'])

I want to use the 'main_id' column of extras_table, and 'col_label' to create additional columns on main_table. i.e.:

result:
   id label       e1        e2
0   1     A   e1_Val    e2_Val
1   2     B     None   e2_Val2
2   3     C  e1_val3      None

Note that some of the rows may not have all of the new columns. Is this possible in Pandas, without iterating over extras table, and adding the new columns/values?

Upvotes: 1

Views: 35

Answers (1)

piRSquared
piRSquared

Reputation: 294238

xdf = extras_table.set_index(['main_id', 'col_label']) \
    .unstack().value.reset_index('main_id')

main_table.merge(xdf, left_on='id', right_on='main_id').drop('main_id', axis=1)

enter image description here

Upvotes: 3

Related Questions