Reputation: 21
Which algorithms can identify duplicate rows in a dataset? For example:
Name  Marks
A     100
B     90
C     80
A     100
I need output like this, where each row is flagged as a first occurrence or a repeat:
Name  Marks  S/D
A     100    Single
B     90     Single
C     80     Single
A     100    Duplicate
I'm looking for some algorithms that can help in this case.
Upvotes: 2
Views: 125
Reputation: 24049
If I understand correctly, `DataFrame.duplicated()` does what you need:
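import pandas as pd

# Sample data from the question
df = pd.DataFrame({'Name': ['A', 'B', 'C', 'A'], 'Marks': [100, 90, 80, 100]})
# duplicated() is True for every row that repeats an earlier row
df['res'] = df.duplicated().map({False: "Single", True: "Duplicated"})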
Output:
>>> df
  Name  Marks         res
0    A    100      Single
1    B     90      Single
2    C     80      Single
3    A    100  Duplicated
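Since the question asks about the algorithm itself, here is a minimal pure-Python sketch of one common approach: a single pass that remembers every row seen so far in a set. The function name `label_duplicates` is made up for illustration, and this is not meant to describe how pandas implements `duplicated()` internally.

```python
def label_duplicates(rows):
    """Label each row "Single" on first sight and "Duplicate" on repeats."""
    seen = set()   # rows encountered so far
    labels = []
    for row in rows:
        key = tuple(row)  # tuples are hashable, so they can be stored in a set
        labels.append("Duplicate" if key in seen else "Single")
        seen.add(key)
    return labels


# Hypothetical sample data mirroring the table in the question
rows = [("A", 100), ("B", 90), ("C", 80), ("A", 100)]
print(label_duplicates(rows))
# ['Single', 'Single', 'Single', 'Duplicate']
```

Like the default `duplicated()` call above, this treats the first occurrence as the original. In pandas, passing `keep=False` flags every member of a duplicated group instead, and `subset=[...]` restricts the comparison to the listed columns.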
Upvotes: 1