Delete duplicated words in the same row in Pandas

Question

I'm pretty new to Python, Pandas, and programming in general. I have a DataFrame that looks something like this (just a simplified example):

   A      B  
0  name1  Dog, Dog, Cat
1  name2  Dog, Bird
2  name3  Cat, Cat, Cat
3  name4  Dog, Cat, Bird

I want to delete the duplicated values on each row, so my DataFrame looks like this:

       A      B  
0  name1  Dog, Cat
1  name2  Dog, Bird
2  name3  Cat
3  name4  Dog, Cat, Bird

I saw that I can do something like that with from collections import OrderedDict, but as I said I'm pretty new to programming, and I have no idea how to do that. It would be great if you could help me, thank you!

Space Impact · Accepted Answer

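Use apply with join: split each string into words, drop the repeats, and join the words back together. dict.fromkeys keeps the first occurrence of each word in its original order (a plain set also removes duplicates, but it does not guarantee the order of the result):

df['B'] = df['B'].apply(lambda x: ', '.join(dict.fromkeys(x.split(', '))))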

print(df)
       A               B
0  name1        Dog, Cat
1  name2       Dog, Bird
2  name3             Cat
3  name4  Dog, Cat, Bird
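
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name dedupe_words and its sep parameter are made up for illustration, and the sketch assumes every cell in column B is a string with items separated by ", " (missing values would need separate handling). On Python versions before 3.7, OrderedDict.fromkeys from collections, as mentioned in the question, should serve the same purpose as dict.fromkeys.

```python
import pandas as pd


def dedupe_words(cell, sep=", "):
    """Remove repeated items from a delimited string, keeping first-occurrence order.

    Hypothetical helper for illustration; assumes `cell` is a str.
    """
    # dict.fromkeys keeps one key per distinct item, in insertion order
    # (guaranteed for plain dicts since Python 3.7).
    return sep.join(dict.fromkeys(cell.split(sep)))


# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    "A": ["name1", "name2", "name3", "name4"],
    "B": ["Dog, Dog, Cat", "Dog, Bird", "Cat, Cat, Cat", "Dog, Cat, Bird"],
})

# Apply the helper to every cell of column B.
df["B"] = df["B"].apply(dedupe_words)
print(df)
```

A named function is used instead of a lambda only so that it can be reused and tested on its own; the behavior is the same as the one-liner.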
