Reputation: 2296
I have a dataframe like this:
item tags
1 awesome, awesome, great
2 cool, fun
3 boring, boring, average
4 ok, expensive
How can I remove the duplicate tags to get:
item tags
1 awesome, great
2 cool, fun
3 boring, average
4 ok, expensive
Upvotes: 0
Views: 64
Reputation: 153460
If I understand correctly, try:
df['new_tags'] = df['tags'].apply(lambda x: ', '.join(set(x.split(', '))))
Output:
item tags new_tags
0 1 awesome, awesome, great awesome, great
1 2 cool, fun cool, fun
2 3 boring, boring, average average, boring
3 4 ok, expensive expensive, ok
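Note that a set does not keep the original order, which is why row 2 comes out as "average, boring" above. If order matters, one possible variation is to deduplicate with dict.fromkeys, which keeps the first occurrence of each key. This is only a sketch: dedupe_tags is a hypothetical helper name, and it assumes the tags are always separated by exactly ", ".

```python
def dedupe_tags(tags: str, sep: str = ", ") -> str:
    """Drop repeated tags, keeping the first occurrence of each."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return sep.join(dict.fromkeys(tags.split(sep)))


# Sample rows copied from the question
rows = [
    "awesome, awesome, great",
    "cool, fun",
    "boring, boring, average",
    "ok, expensive",
]
deduped = [dedupe_tags(r) for r in rows]
print(deduped)
```

On the dataframe from the question it could be applied as df['tags'].apply(dedupe_tags).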
Upvotes: 0
Reputation: 25239
Use a list comprehension with str.split, pd.unique, and join:
df['unique_tags'] = [', '.join(pd.unique(x)) for x in df.tags.str.split(', ')]
Output:
item tags unique_tags
0 1 awesome, awesome, great awesome, great
1 2 cool, fun cool, fun
2 3 boring, boring, average boring, average
3 4 ok, expensive ok, expensive
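For anyone who wants to reproduce this end to end, here is a self-contained sketch. The dataframe is rebuilt by hand from the question. Wrapping each list in a pd.Series before calling pd.unique is my own precaution, since recent pandas versions may warn when pd.unique is given a plain list; it is not part of the one-liner above.

```python
import pandas as pd

# Dataframe rebuilt from the question
df = pd.DataFrame({
    "item": [1, 2, 3, 4],
    "tags": [
        "awesome, awesome, great",
        "cool, fun",
        "boring, boring, average",
        "ok, expensive",
    ],
})

# Split each cell into a list, deduplicate in first-seen order, then rejoin.
# pd.unique keeps the order of first appearance, unlike set().
df["unique_tags"] = [
    ", ".join(pd.unique(pd.Series(parts, dtype=object)))
    for parts in df["tags"].str.split(", ")
]

result = df["unique_tags"].tolist()
print(result)
```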
Upvotes: 1