Reputation: 17
I have a dataframe and would like to transform the dataframe that the values in one column become new columns with the values from another column.
df = pd.DataFrame({'id': ['1','1','2','2','2','3','3','3','3'],
'name': ['Andi','Andy','Ben','Ben','Benjamin','Charlie','Charlie','Charlie','Charles'],
'event': ['Basket','Basket','Basket','Soccer','Soccer','Basket','Basket','Soccer','Basket'],
'reg_num': ['435','436','123','341','231','223','115','432','67']
})
The result I expected is
Basket Soccer
id Name
1 Andi, Andy 435,436
2 Ben, Benjamin 123 231,341
3 Charlie,Charles 223,115,67 432
I try to group by
df2 = df.reset_index().groupby(['id','name','event'])['reg_num'].aggregate('first').unstack()
but id doesn't merge the name or reg number
Upvotes: 0
Views: 90
Reputation: 862611
Use DataFrame.pivot_table
with add level of MultiIndex by aggregate join
for unique pairs by DataFrame.drop_duplicates
:
s = df.drop_duplicates(['id','name']).groupby('id')['name'].agg(','.join)
print (s)
id
1 Andi,Andy
2 Ben,Benjamin
3 Charlie,Charles
Name: name, dtype: object
df2 = (df.pivot_table(index='id',
columns='event',
values='reg_num',
aggfunc=','.join,
fill_value='')
.assign(Name=s)
.set_index('Name', append=True))
print (df2)
event Basket Soccer
id Name
1 Andi,Andy 435,436
2 Ben,Benjamin 123 341,231
3 Charlie,Charles 223,115,67 432
Upvotes: 1