Reputation: 4944
Given the following two pandas DataFrames:
main_table = pd.DataFrame([[1, 'A'],
[2, 'B'],
[3,'C']],
columns=['id', 'label'])
extras_table = pd.DataFrame([[1, 'e1', 'e1_Val'],
[1, 'e2', 'e2_Val'],
[2, 'e2', 'e2_Val2'],
[3, 'e1', 'e1_val3']],
columns=['main_id', 'col_label', 'value'])
I want to use the 'main_id' column of extras_table, and 'col_label' to create additional columns on main_table. i.e.:
result:
id label e1 e2
0 1 A e1_Val e2_Val
1 2 B None e2_Val2
2 3 C e1_val3 None
Note that some of the rows may not have all of the new columns. Is this possible in Pandas, without iterating over extras table, and adding the new columns/values?
Upvotes: 1
Views: 35
Reputation: 294238
xdf = extras_table.set_index(['main_id', 'col_label']) \
.unstack().value.reset_index('main_id')
main_table.merge(xdf, left_on='id', right_on='main_id').drop('main_id', axis=1)
Upvotes: 3