I have a DataFrame df that currently looks something like this:
Car Name      Number
Adam Leaf     9
Adamm Leaf    9
Adam Lea      NaN
Adam-Leaf     NaN
Adam/Leaf     9
Claire-Green  NaN
Cliare Green  3
Claire Green  3
Claire Gren   NaN
Claire/Green  3
I am trying to collapse these spelling variations into one row per name, to get something like this:
Car Name      Number
Adam Leaf     9
Claire Green  3
Upvotes: 1
Views: 160
Reputation: 323226
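Here is one way, using the Soundex phonetic encoding from the jellyfish library: group the rows by the Soundex code of each name and keep the first non-null value in each group.
import jellyfish
# Names that sound alike share a Soundex code; first() keeps the first non-null value per column.
s = df.groupby(df['Car Name'].apply(jellyfish.soundex)).first()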
          Car Name      Number
Car Name
A354      Adam Leaf     9.0
C462      Claire-Green  3.0
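Note that first() keeps whichever spelling appears first in each group, so the second row comes back as "Claire-Green" rather than "Claire Green". As a rough, library-free sketch of the same grouping idea, the snippet below uses a simplified Soundex-style key and then keeps the most frequent spelling in each group after treating "-" and "/" as spaces. The helper soundex_key, the rows list and the tie-breaking rule are assumptions made for illustration; this is not jellyfish's implementation, and it works on plain tuples rather than a DataFrame.

```python
from collections import Counter, defaultdict

# Standard Soundex consonant classes (letters absent here carry no digit).
CODES = {ch: digit
         for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                                ("L", "4"), ("MN", "5"), ("R", "6"))
         for ch in letters}


def soundex_key(name):
    """Simplified Soundex-style key (hypothetical helper, not jellyfish's code)."""
    letters = [c for c in name.upper() if c.isalpha()]  # drop spaces, '-', '/'
    if not letters:
        return ""
    key = letters[0]
    last = CODES.get(letters[0])
    for ch in letters[1:]:
        digit = CODES.get(ch)
        if digit is None:
            if ch not in "HW":  # vowels end a run of equal codes; H and W do not
                last = None
            continue
        if digit != last:
            key += digit
            if len(key) == 4:
                break
        last = digit
    return key.ljust(4, "0")


# Sample data from the question; None stands in for NaN.
rows = [
    ("Adam Leaf", 9), ("Adamm Leaf", 9), ("Adam Lea", None), ("Adam-Leaf", None),
    ("Adam/Leaf", 9), ("Claire-Green", None), ("Cliare Green", 3),
    ("Claire Green", 3), ("Claire Gren", None), ("Claire/Green", 3),
]

# Bucket the rows by phonetic key.
groups = defaultdict(list)
for name, number in rows:
    groups[soundex_key(name)].append((name, number))

result = []
for key, members in groups.items():
    # Treat '-' and '/' as spaces, then keep the most frequent spelling.
    spellings = Counter(n.replace("-", " ").replace("/", " ") for n, _ in members)
    best_name = spellings.most_common(1)[0][0]
    # First non-missing number in the group, or None if every value is missing.
    number = next((v for _, v in members if v is not None), None)
    result.append((best_name, number))

print(result)  # [('Adam Leaf', 9), ('Claire Green', 3)]
```

Phonetic keys are coarse, so on real data unrelated names can land in the same group; it is worth inspecting the groups before dropping rows.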
Upvotes: 3