How to merge rows when some columns are the same using Pandas Python

Question

I have a dataframe. For rows that share the same value in column A, I want to merge them into one row, separating the different values with commas and removing any nulls.

import pandas as pd
import numpy as np

s1 = pd.Series(['a', 'd', np.nan])
s2 = pd.Series(['a', 'f', np.nan])
s3 = pd.Series(['a', 'e', 'i'])
s4 = pd.Series(['c', 'g', 'j'])

df = pd.DataFrame([list(s1), list(s2), list(s3), list(s4)], columns=['A', 'B', 'C'])
df


    A   B   C
0   a   d   NaN
1   a   f   NaN
2   a   e   i
3   c   g   j

Desired outcome:

    A   B       C
0   a   d,e,f   i
1   c   g       j

BENY · Accepted Answer

Try groupby with agg, joining B with commas and taking the first non-null value of C:

out = df.groupby('A', as_index=False).agg({'B': ','.join, 'C': 'first'})
   A      B  C
0  a  d,f,e  i
1  c      g  j

Update: to drop nulls and duplicates in each column before joining

def join_unique(x):
    # drop nulls and repeated values, then comma-join what is left
    return ','.join(x.dropna().drop_duplicates())

out = df.groupby('A', as_index=False).agg({'B': join_unique, 'C': join_unique})
out
   A      B  C
0  a  d,f,e  i
1  c      g  j
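
The output above keeps values in order of first appearance (d,f,e), while the desired outcome in the question lists them alphabetically (d,e,f). If that ordering matters, one possible variation is to sort the unique values before joining. This is only a sketch: the helper name join_unique_sorted is made up for illustration, and the frame is rebuilt from the table printed in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the table shown in the question.
df = pd.DataFrame(
    [["a", "d", np.nan], ["a", "f", np.nan], ["a", "e", "i"], ["c", "g", "j"]],
    columns=["A", "B", "C"],
)


def join_unique_sorted(values: pd.Series) -> str:
    """Hypothetical helper: drop nulls and duplicates, sort, then comma-join."""
    return ",".join(sorted(values.dropna().unique()))


# Apply the same helper to every non-key column within each group of A.
out = df.groupby("A", as_index=False).agg(
    {"B": join_unique_sorted, "C": join_unique_sorted}
)
print(out)
```

With the sample data, column B for group a becomes d,e,f. Sorting assumes the values in a column are mutually comparable (all strings here); mixed types would need a key function or a cast to str first.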
