Santosh Kadge
Santosh Kadge

Reputation: 83

pandas group by and convert rows into multiple columns

data = {'groupId':[1,1,2], 'email':['[email protected]', '[email protected]', '[email protected]'],
        'type':['office','personal','personal'],'name':['santy','santy','will']} 
df = pd.DataFrame(data) 

I have a dataframe like this

groupId email   type           name
1   [email protected]    office      santy
1   [email protected]    personal    santy
2   [email protected]    personal    will

I want to transform rows into dynamic columns depending on the number of rows in in particular group

groupId email1         type1   email2          type2       name
1      [email protected]  office    [email protected]    personal    santy
2      [email protected]   personal   na              na        will

I know I can use set_index with unstack but confused on how I can give the column names and create those many columns as in particular group.

Is there efficient way do do this? Any help would be appreciated

Upvotes: 1

Views: 1333

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150765

You can do:

new_df = (df.assign(col=df.groupby('groupId').cumcount()+1)
   .set_index(['groupId','col'])
   .unstack('col')
   .sort_index(level=(1,0), axis=1)
)

new_df.columns = [f'{x}{y}' for x,y in new_df.columns]

Output:

               email1     type1        email2     type2
groupId                                                
1        [email protected]    office  [email protected]  personal
2        [email protected]  personal           NaN       NaN

Upvotes: 3

Related Questions