nous

Reputation: 109

Dropping duplicate values in a column

I have a DataFrame like this:

df = pd.DataFrame({'America':["24,23,24,24","10","AA,AA, XY"]})

I tried converting it to a list, a set, etc., but couldn't get it to work.

How can I drop the duplicate values in each row?

Upvotes: 1

Views: 42

Answers (2)

Rakesh

Reputation: 82805

This is one approach using str.split.

Example:

import pandas as pd

df = pd.DataFrame({'America':["24,23,24,24","10","AA,AA, XY"]})
print(df["America"].str.split(",").apply(set))

Output:

0     {24, 23}
1         {10}
2    {AA,  XY}
Name: America, dtype: object
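Note that the last row in the output above still contains " XY" with a leading space, because splitting on "," does not trim whitespace. A minimal sketch of one way to handle that, assuming surrounding whitespace is not meaningful in the data (the name `cleaned` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'America': ["24,23,24,24", "10", "AA,AA, XY"]})

# Split on commas, then strip surrounding whitespace from each piece
# before building the set, so " XY" is stored as "XY".
cleaned = df["America"].apply(lambda s: {part.strip() for part in s.split(",")})
print(cleaned)
```

Sets are unordered, so the order of elements within each row may differ between runs.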

Upvotes: 1

jezrael

Reputation: 863701

Use a custom function with split and set:

df['America'] = df['America'].apply(lambda x: set(x.split(',')))

Another solution is to use a list comprehension:

df['America'] = [set(x.split(',')) for x in df['America']]

print(df)
     America
0   {23, 24}
1       {10}
2  {AA,  XY}
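Both solutions store a set in each cell, which loses the original order and changes the column from strings to sets. If the goal is to keep a comma-separated string with duplicates removed in first-seen order, something like the following might work. This is only a sketch: `dedupe_csv` is a hypothetical helper, not a pandas function, and it assumes whitespace around values can be stripped.

```python
import pandas as pd

df = pd.DataFrame({'America': ["24,23,24,24", "10", "AA,AA, XY"]})

def dedupe_csv(value):
    # Hypothetical helper: split on commas and trim each piece.
    parts = [p.strip() for p in value.split(",")]
    # dict.fromkeys keeps only the first occurrence of each key
    # and preserves insertion order (Python 3.7+).
    return ",".join(dict.fromkeys(parts))

df["America"] = df["America"].apply(dedupe_csv)
print(df["America"].tolist())
# ['24,23', '10', 'AA,XY']
```

The column stays a plain string column, so it can still be written to CSV or compared directly.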

Upvotes: 1
