Mohammad Amir

Reputation: 133

Creating a binary flag for a column

I have data like this:

ID    col1
1      a
2      b
3      c
4      d
5      c

I want to create a binary flag for col1 such that each distinct value is flagged with 1 exactly once: unique values get 1, and for a repeated value only one of its occurrences gets 1 while the others get 0. Expected output:

ID    col1    Flag
1      a       1 
2      b       1
3      c       1
4      d       1 
5      c       0

It does not matter which occurrence of a repeated value gets the 1. I am not sure how to get started on this, so any help is appreciated.

Upvotes: 1

Views: 866

Answers (1)

jezrael

Reputation: 862741

Use Series.duplicated with numpy.where to set 1 for unique values and for the first occurrence of each repeated value, and 0 for every later repeat:

import numpy as np

df['Flag'] = np.where(df['col1'].duplicated(), 0, 1)
print(df)
   ID col1  Flag
0   1    a     1
1   2    b     1
2   3    c     1
3   4    d     1
4   5    c     0
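
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question (the construction itself is my assumption about how the data is loaded); the flag line is the same np.where call as above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; how it is loaded is an assumption.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "col1": ["a", "b", "c", "d", "c"],
})

# duplicated() is False for the first occurrence of each value and
# True for every later repeat, so repeats get 0 and everything else gets 1.
df["Flag"] = np.where(df["col1"].duplicated(), 0, 1)

print(df)
```

Since duplicated() defaults to keep='first', it is the first "c" (ID 3) that receives the 1.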

Another idea is to invert the duplicated mask with ~ and reinterpret the resulting booleans as 8-bit integers with Series.view:

df['Flag'] = (~df['col1'].duplicated()).view('i1')
print(df)
   ID col1  Flag
0   1    a     1
1   2    b     1
2   3    c     1
3   4    d     1
4   5    c     0
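
As far as I know, Series.view is deprecated in recent pandas releases (2.2 and later), so a variant with astype may be safer there. The sketch below is an alternative, not part of the original answer: the column name FlagLast is made up for illustration, and the keep='last' call is only included because the question says it does not matter which occurrence gets the 1.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "col1": ["a", "b", "c", "d", "c"],
})

# Invert the mask so the first occurrence of each value is True,
# then convert the booleans to small integers with astype.
df["Flag"] = (~df["col1"].duplicated()).astype("int8")

# Hypothetical extra column: keep='last' flags the last occurrence
# of each repeated value instead of the first.
df["FlagLast"] = (~df["col1"].duplicated(keep="last")).astype("int8")

print(df)
```

Both columns contain exactly one 1 per distinct value; they differ only in which "c" row carries it.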

Upvotes: 2
