Reputation: 143
I have a dataframe that looks like that :
ID | Var1 | Var2 | Var3 |
---|---|---|---|
1 | row1 | abc | 321 |
1 | row2 | bcd | 456 |
2 | row4 | def | 555 |
2 | row5 | tss | 889 |
And what I want to do is to join all the strings by a comma in Var1 according to ID, but retain all the other columns as is. So the last one should look like this:
ID | Var1 | Var2 | Var3 |
---|---|---|---|
1 | row1, row2 | abc | 321 |
1 | row1, row2 | bcd | 456 |
2 | row4, row5 | def | 555 |
2 | row4, row5 | tss | 889 |
I have tried aggregating, but if i do that, Var2 and Var3 is dropped. I have a few columns whose rows I want to join like I did with Var 1, is there a way to do this?
Thanks for the help !
Upvotes: 0
Views: 57
Reputation: 24324
use groupby()
+transform()
:
if order matters:
df['Var1']=df.groupby('ID')['Var1'].transform(lambda x:','.join(pd.unique(x)))
if order doesn't matters:
df['Var1']=df.groupby('ID')['Var1'].transform(lambda x:','.join(set(x)))
output of df
:
ID Var1 Var2 Var3
0 1 row1,row2 abc 321
1 1 row1,row2 bcd 456
2 2 row4,row5 def 555
3 2 row4,row5 tss 889
Upvotes: 1