Reputation: 109
I have a DataFrame like this:
df = pd.DataFrame({'America':["24,23,24,24","10","AA,AA, XY"]})
I tried converting it to a list, a set, etc., but couldn't get it to work.
How can I drop the duplicates within each cell?
Upvotes: 1
Views: 42
Reputation: 82805
This is one approach using str.split.
Example:
import pandas as pd
df = pd.DataFrame({'America':["24,23,24,24","10","AA,AA, XY"]})
print(df["America"].str.split(",").apply(set))
Output:
0 {24, 23}
1 {10}
2 {AA, XY}
Name: America, dtype: object
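Note that splitting on "," alone keeps the leading space in " XY", so a cell like "AA, AA" would be treated as two different values. A minimal sketch that strips each piece before building the set, assuming stray whitespace should be ignored (the helper name unique_items is made up for this example and is not part of pandas):

```python
import pandas as pd

df = pd.DataFrame({'America': ["24,23,24,24", "10", "AA,AA, XY"]})

def unique_items(cell):
    # Hypothetical helper: split on commas, strip surrounding
    # whitespace from each piece, and collect the pieces into a set.
    return {part.strip() for part in cell.split(",")}

result = df["America"].apply(unique_items)
print(result)
```

Sets are unordered, so the elements inside each set may print in a different order from run to run.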
Upvotes: 1
Reputation: 863701
Use a custom function with split and set:
df['America'] = df['America'].apply(lambda x: set(x.split(',')))
Another solution is to use a list comprehension:
df['America'] = [set(x.split(',')) for x in df['America']]
print(df)
America
0 {23, 24}
1 {10}
2 {AA, XY}
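If the original order matters, or you want a comma-separated string back instead of a set, one possible variation is dict.fromkeys, which keeps the first occurrence of each key in insertion order (Python 3.7+). This is only a sketch: the helper name dedupe_cell and the column name America_unique are made up for illustration, and it assumes whitespace around each value can be stripped.

```python
import pandas as pd

df = pd.DataFrame({'America': ["24,23,24,24", "10", "AA,AA, XY"]})

def dedupe_cell(cell):
    # Hypothetical helper: split on commas and strip whitespace,
    # then drop repeats while keeping first-seen order.
    parts = [p.strip() for p in cell.split(",")]
    return ",".join(dict.fromkeys(parts))

# Store the result in a new (illustrative) column so the original stays intact.
df["America_unique"] = df["America"].apply(dedupe_cell)
print(df["America_unique"].tolist())
# ['24,23', '10', 'AA,XY']
```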
Upvotes: 1