Combining Pandas Dataframe with indexed value as column name

Question

Given the following two pandas DataFrames:

main_table = pd.DataFrame([[1, 'A'],
                           [2, 'B'],
                           [3,'C']],
                           columns=['id', 'label'])

extras_table = pd.DataFrame([[1, 'e1', 'e1_Val'], 
                             [1, 'e2', 'e2_Val'],
                             [2, 'e2', 'e2_Val2'],
                             [3, 'e1', 'e1_val3']], 
                             columns=['main_id', 'col_label', 'value'])

I want to use the 'main_id' column of extras_table, and 'col_label' to create additional columns on main_table. i.e.:

result:
   id label       e1        e2
0   1     A   e1_Val    e2_Val
1   2     B     None   e2_Val2
2   3     C  e1_val3      None

Note that some of the rows may not have all of the new columns. Is this possible in Pandas, without iterating over extras table, and adding the new columns/values?

piRSquared · Accepted Answer

xdf = extras_table.set_index(['main_id', 'col_label']) \
    .unstack().value.reset_index('main_id')

main_table.merge(xdf, left_on='id', right_on='main_id').drop('main_id', axis=1)

Combining Pandas Dataframe with indexed value as column name

Answers (1)

Related Questions