Reputation: 11
I have a pandas data frame like this:
Col1 Col2
0 a Jack
1 a Jill
2 b Bob
3 c Cain
4 c Sam
5 a Adam
6 b Abel
What I want to do is combine the values in Col2 for each distinct value in Col1, i.e., the output should look like this:
Col1 Col2
0 a Jack, Jill, Adam
1 b Bob, Abel
2 c Cain, Sam
How can I best approach this problem? Any advice would be helpful. Thanks in advance!
Upvotes: 1
Views: 718
Reputation: 246
Here is a different approach, try it out:
df.groupby("Col1").agg(lambda x: ', '.join(x.unique())).reset_index()
Col1 Col2
0 a Jack, Jill, Adam
1 b Bob, Abel
2 c Cain, Sam
One thing to keep in mind: if your dataset contained repeated rows, like this:
Col1 Col2
0 a Jack
1 a Jill
2 b Bob
3 c Cain
4 c Sam
5 a Adam
6 b Abel
7 a Adam
8 c Sam
Joining without unique would then give the following output:
df.groupby("Col1").agg(lambda x: ', '.join(x)).reset_index()
Col1 Col2
0 a Jack, Jill, Adam, Adam
1 b Bob, Abel
2 c Cain, Sam, Sam
So by using unique you remove duplicates in Col2.
Hope that helps
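For anyone who wants to reproduce the comparison, here is a minimal, self-contained sketch. It rebuilds the nine-row frame shown above by hand; the variable names with_dupes and deduped are only illustrative.

```python
import pandas as pd

# Rebuild the example frame that contains the repeated rows (Adam, Sam).
df = pd.DataFrame({
    "Col1": ["a", "a", "b", "c", "c", "a", "b", "a", "c"],
    "Col2": ["Jack", "Jill", "Bob", "Cain", "Sam", "Adam", "Abel", "Adam", "Sam"],
})

# Plain join: every value in the group is kept, repeats included.
with_dupes = df.groupby("Col1").agg(lambda x: ", ".join(x)).reset_index()

# Join over unique(): repeats are dropped, first-seen order is kept.
deduped = df.groupby("Col1").agg(lambda x: ", ".join(x.unique())).reset_index()

print(with_dupes)
print(deduped)
```

Whether to use unique depends on whether repeated names carry meaning in your data; if they do, keep the plain join.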
Upvotes: 0
Reputation: 42946
Use groupby with apply (assigning to a new name so the original df is not overwritten):
out = df.groupby('Col1')['Col2'].apply(', '.join)
print(out)
Col1
a Jack, Jill, Adam
b Bob, Abel
c Cain, Sam
Name: Col2, dtype: object
Use reset_index to get Col1 back as a column instead of the index:
df = df.groupby('Col1')['Col2'].apply(', '.join).reset_index()
print(df)
Col1 Col2
0 a Jack, Jill, Adam
1 b Bob, Abel
2 c Cain, Sam
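The two steps above can be run end to end with the sketch below. It constructs the question's seven-row frame by hand, and the names joined and result are just illustrative, chosen so that df itself is left untouched.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({
    "Col1": ["a", "a", "b", "c", "c", "a", "b"],
    "Col2": ["Jack", "Jill", "Bob", "Cain", "Sam", "Adam", "Abel"],
})

# Step 1: a Series of joined strings, indexed by Col1.
joined = df.groupby("Col1")["Col2"].apply(", ".join)

# Step 2: move Col1 from the index back into a regular column.
result = joined.reset_index()

print(joined)
print(result)
```

Keeping the intermediate Series under its own name makes it easy to inspect both shapes before deciding which one you need.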
Upvotes: 2
Reputation: 18218
You can also try the following, which is similar to the other answers:
new_df = df.groupby('Col1', as_index=False).agg(', '.join)
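One thing worth knowing about this form is that agg is applied to every column other than the grouping key, not just Col2. The sketch below illustrates that with an extra column, Col3, which is hypothetical and not part of the question's data.

```python
import pandas as pd

# The question's frame plus a hypothetical extra string column, Col3.
df = pd.DataFrame({
    "Col1": ["a", "a", "b", "c", "c", "a", "b"],
    "Col2": ["Jack", "Jill", "Bob", "Cain", "Sam", "Adam", "Abel"],
    "Col3": ["p", "q", "r", "s", "t", "u", "v"],  # assumption, for illustration only
})

# as_index=False keeps Col1 as a regular column, so no reset_index is needed.
# The join is applied to both Col2 and Col3.
new_df = df.groupby("Col1", as_index=False).agg(", ".join)

print(new_df)
```

If the frame has non-string columns, the join would likely fail on them, so selecting the column first (as in the ['Col2'] answer) may be the safer choice there.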
Upvotes: 0