Reputation: 133
I have data like this:
ID col1
1 a
2 b
3 c
4 d
5 c
I want to create a binary flag for col1 such that each distinct value is flagged with 1 exactly once, and every repeated occurrence of a value gets 0. Expected output:
ID col1 Flag
1 a 1
2 b 1
3 c 1
4 d 1
5 c 0
For a repeated value, it does not matter which occurrence gets the 1. Please help me with this; I don't know where to start.
Upvotes: 1
Views: 866
Reputation: 862741
Use Series.duplicated with numpy.where to set 1 for unique values and for the first occurrence of each repeated value, and 0 for later repeats:
df['Flag'] = np.where(df['col1'].duplicated(), 0, 1)
print (df)
ID col1 Flag
0 1 a 1
1 2 b 1
2 3 c 1
3 4 d 1
4 5 c 0
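For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question rebuilt by hand as a DataFrame; the intermediate name is_repeat is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand
df = pd.DataFrame({"ID": [1, 2, 3, 4, 5],
                   "col1": ["a", "b", "c", "d", "c"]})

# duplicated() is False for the first occurrence of a value
# and True for every later repeat of it
is_repeat = df["col1"].duplicated()

# 0 where the value is a repeat, 1 everywhere else
df["Flag"] = np.where(is_repeat, 0, 1)
print(df)
```

Only the second "c" (ID 5) is marked as a repeat, so it is the only row that gets 0.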
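Another idea is to invert the mask with ~ and convert it to integers with Series.view (note that Series.view is deprecated since pandas 2.2; Series.astype is the suggested replacement):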
df['Flag'] = (~df['col1'].duplicated()).view('i1')
print (df)
ID col1 Flag
0 1 a 1
1 2 b 1
2 3 c 1
3 4 d 1
4 5 c 0
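A minimal sketch of the same inverted-mask idea, written with Series.astype(int) in place of view, which should also work on newer pandas versions. Since the question says the position of the 1 does not matter, it also shows keep="last", which moves the 1 to the last occurrence. The column name FlagLast is made up here for illustration.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand
df = pd.DataFrame({"ID": [1, 2, 3, 4, 5],
                   "col1": ["a", "b", "c", "d", "c"]})

# ~ inverts the boolean mask, astype(int) turns True/False into 1/0
df["Flag"] = (~df["col1"].duplicated()).astype(int)

# keep="last" treats the last occurrence as the original,
# so the earlier "c" (ID 3) gets 0 instead
df["FlagLast"] = (~df["col1"].duplicated(keep="last")).astype(int)
print(df)
```

Either column satisfies the requirement that each distinct value is flagged exactly once.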
Upvotes: 2