Jayne How
Jayne How

Reputation: 143

Is there a way to collapse a few columns based on an index, while retaining the other columns?

I have a dataframe that looks like that :

ID Var1 Var2 Var3
1 row1 abc 321
1 row2 bcd 456
2 row4 def 555
2 row5 tss 889

And what I want to do is to join all the strings by a comma in Var1 according to ID, but retain all the other columns as is. So the last one should look like this:

ID Var1 Var2 Var3
1 row1, row2 abc 321
1 row1, row2 bcd 456
2 row4, row5 def 555
2 row4, row5 tss 889

I have tried aggregating, but if i do that, Var2 and Var3 is dropped. I have a few columns whose rows I want to join like I did with Var 1, is there a way to do this?

Thanks for the help !

Upvotes: 0

Views: 57

Answers (1)

Anurag Dabas
Anurag Dabas

Reputation: 24324

use groupby()+transform():

if order matters:

df['Var1']=df.groupby('ID')['Var1'].transform(lambda x:','.join(pd.unique(x)))

if order doesn't matters:

df['Var1']=df.groupby('ID')['Var1'].transform(lambda x:','.join(set(x)))

output of df:

    ID  Var1        Var2    Var3
0   1   row1,row2   abc     321
1   1   row1,row2   bcd     456
2   2   row4,row5   def     555
3   2   row4,row5   tss     889

Upvotes: 1

Related Questions