Reputation: 47
I have a dataframe. For each value in column A, I want to combine the distinct values of the other columns into a comma-separated string and drop any nulls.
import pandas as pd
import numpy as np
s1 = pd.Series(['a', 'd', np.nan])
s2 = pd.Series(['a','f',np.nan])
s3 = pd.Series(['a', 'e','i'])
s4 = pd.Series(['c', 'g','j'])
df = pd.DataFrame([list(s1), list(s2), list(s3),list(s4)], columns = ['A', 'B','C'])
df
A B C
0 a d NaN
1 a f NaN
2 a e i
3 c g j
Desired outcome:
A B C
0 a d,e,f i
1 c g j
Upvotes: 0
Views: 37
Reputation: 323286
Try grouping by A, joining B with commas and taking the first non-null value of C:
out = df.groupby('A',as_index=False).agg({'B':','.join,'C':'first'})
A B C
0 a d,f,e i
1 c g j
Update: to drop nulls and duplicates in each column before joining:
out = df.groupby('A', as_index=False).agg({
    'B': lambda x: ','.join(x.dropna().drop_duplicates()),
    'C': lambda x: ','.join(x.dropna().drop_duplicates()),
})
out
A B C
0 a d,f,e i
1 c g j
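If you need the joined values in alphabetical order (d,e,f as in the desired outcome, rather than d,f,e in order of appearance), here is a minimal sketch that sorts before joining and applies the same function to every column except A. The helper name join_sorted_unique is my own, not a pandas function, and the frame is rebuilt from the output shown in the question.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the output printed in the question.
df = pd.DataFrame(
    [["a", "d", np.nan], ["a", "f", np.nan], ["a", "e", "i"], ["c", "g", "j"]],
    columns=["A", "B", "C"],
)


def join_sorted_unique(values):
    """Hypothetical helper: drop nulls and duplicates, sort, then join with commas."""
    return ",".join(sorted(values.dropna().unique()))


# Apply the helper to every column except the grouping key.
value_cols = [col for col in df.columns if col != "A"]
out = df.groupby("A", as_index=False).agg({col: join_sorted_unique for col in value_cols})
print(out)
```

Note that sorting assumes all non-null values in a column are strings. A group whose values are all null ends up as an empty string, not NaN.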
Upvotes: 2