Reputation: 11
Names
['abc aa','bdc sc','abc aa','bdc sp','bdc sc','pp sc','bdc sc',]
['lp aa','bd sc','bdc sc','bd sc','lp aa','bd sc']
['nn aa','bb sc','bb sc','nn aa','bd sc']
I tried the following:
def drop_dupli(text):
    seen = set()
    result = []
    for item in text.split():
        if item not in seen:
            seen.add(item)
            result.append(item)
    return " ".join(result)

df['newame'] = df['Names'].apply(lambda x: drop_dupli(x))
The result came as follows:
Names
['abc aa','bdc sc','abc ','bdc sp','bdc ','pp sc','bdc ',]
['lp aa','bd sc','bdc sc','bd ','lp ','bd ']
['nn aa','bb sc','bb ','nn ','bd ']
But I want the result to be as follows:
Names
['abc aa','bdc sc','bdc sp','pp sc']
['lp aa','bd sc','bdc sc']
['nn aa','bb sc','bd sc']
Upvotes: 1
Views: 31
Reputation: 862771
Use the dict.fromkeys trick to remove duplicates while keeping the original order:
df['newame'] = df['Names'].apply(lambda x: list(dict.fromkeys(x)))
print (df)
Names \
0 [abc aa, bdc sc, abc aa, bdc sp, bdc sc, pp sc...
1 [lp aa, bd sc, bdc sc, bd sc, lp aa, bd sc]
2 [nn aa, bb sc, bb sc, nn aa, bd sc]
newame
0 [abc aa, bdc sc, bdc sp, pp sc]
1 [lp aa, bd sc, bdc sc]
2 [nn aa, bb sc, bd sc]
If you use set instead, the original order is not preserved:
df['newame'] = df['Names'].apply(lambda x: list(set(x)))
print (df)
Names \
0 [abc aa, bdc sc, abc aa, bdc sp, bdc sc, pp sc...
1 [lp aa, bd sc, bdc sc, bd sc, lp aa, bd sc]
2 [nn aa, bb sc, bb sc, nn aa, bd sc]
newame
0 [pp sc, bdc sp, bdc sc, abc aa]
1 [lp aa, bd sc, bdc sc]
2 [bb sc, nn aa, bd sc]
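The dict.fromkeys approach above can also be wrapped in a small helper. This is only a sketch: the name dedupe_keep_order is hypothetical, and the ast.literal_eval step rests on the assumption that some cells might hold the string form of a list (as the use of text.split() in the question suggests) rather than a real list.

```python
import ast

def dedupe_keep_order(value):
    """Drop repeated items while keeping first-seen order."""
    # Assumption: a cell is either a real list or its string form,
    # e.g. "['lp aa','bd sc']"; the string form is parsed first.
    if isinstance(value, str):
        value = ast.literal_eval(value)
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(value))

# Sample rows taken from the question; the second one is given as a string
# to exercise the parsing branch.
rows = [
    ['abc aa', 'bdc sc', 'abc aa', 'bdc sp', 'bdc sc', 'pp sc', 'bdc sc'],
    "['lp aa','bd sc','bdc sc','bd sc','lp aa','bd sc']",
    ['nn aa', 'bb sc', 'bb sc', 'nn aa', 'bd sc'],
]
cleaned = [dedupe_keep_order(row) for row in rows]
for row in cleaned:
    print(row)
```

With a DataFrame, the same helper could be passed directly to apply, for example df['Names'].apply(dedupe_keep_order). Each whole item such as 'abc aa' is treated as one key, so nothing is split on whitespace.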
Upvotes: 1