behappy
behappy

Reputation: 35

how to combine multiple columns and multiple rows based on group by of another column

I am trying to combine multiple columns and rows into single column based on group by field of ID column. The input is

|Id |   Sample_id | Sample_name |   Sample_number|
|:--|:------------|:-----------:|---------------:|             
|1  |  123        | Abcdef|ghij |  1234567       |
|1  |   345       | Vbnhj|tt|t  |   45678        |
|1  |   456       | Ffff|yyy|yy |   789000       |

Expected output : columns,rows belonging to same Id should be combined and form a list like below.

Id  Sample_details              
    123,Abcdef|ghij,1234567
 1  345,Vbnhj|tt|t, 45678       
    456 ,Ffff|yyy|yy,789000

 2   536 ,Ftff|uyy|iy,79000
     453, hnhj|tdd|rr, 67678 
   

I tried below which is not working

df.groupby('Id')['Sample_id']['Sample_name']['Sample_number'].apply(','.join).reset_index()

Upvotes: 0

Views: 291

Answers (1)

Corralien
Corralien

Reputation: 120391

Try:

df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)

out = df.groupby('Id')['Sample_details'].apply('\n'.join).reset_index()

Output:

>>> out
   Id                                     Sample_details
0   1  123,Abcdef|ghij,1234567\n345,Vbnhj|tt|t,45678\...

Note: Pandas does not interpret escape sequence.

Upvotes: 1

Related Questions