Arjun Chaudhary

Reputation: 2453

Assigning a unique identifier to a group of groups in a pandas DataFrame

Given a DataFrame like this:

import pandas as pd
df = pd.DataFrame({'A':[1,2,3,4,6,3,7,3,2,11,13,10,1,5],'B':[1,1,1,2,2,2,2,3,3,3,3,3,4,4], 
                   'C':[1,1,1,1,1,1,1,2,2,2,2,2,3,3]})

I want to assign a unique identifier to sets of consecutive groups in column B. For example, going from the top, every two groups in B should share one identifier, as shown by the red boxes in the image below. The end result would look like this:

[Image: expected result, with every two groups of B outlined in red]

Currently I am doing it as shown below, but it seems like overkill. It takes too long to update even 70,000 rows:

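# Number of distinct groups in B (this assumes B holds the values 1..n)
b_unique_cnt = df['B'].nunique()
the_list = list(range(1, b_unique_cnt + 1))
slice_size = 2
# Zip the same iterator with itself to get consecutive tuples of slice_size values
list_of_slices = zip(*(iter(the_list),) * slice_size)
counter = 1
df['D'] = -1
# One boolean-mask assignment per tuple of B values
for i in list_of_slices:
    df.loc[df['B'].isin(i), 'D'] = counter
    counter += 1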

df.head(15)

Upvotes: 0

Views: 19

Answers (1)

BENY

Reputation: 323366

You could factorize column B and integer-divide the resulting codes by 2:

df['new'] = df.B.factorize()[0] // 2 + 1
# Alternative: (df.groupby(['B'], sort=False).ngroup() // 2).add(1)

df
Out[153]: 
     A  B  C  new
0    1  1  1    1
1    2  1  1    1
2    3  1  1    1
3    4  2  1    1
4    6  2  1    1
5    3  2  1    1
6    7  2  1    1
7    3  3  2    2
8    2  3  2    2
9   11  3  2    2
10  13  3  2    2
11  10  3  2    2
12   1  4  3    2
13   5  4  3    2
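
To spell out why the one-liner works, here is a minimal sketch that splits it into steps. The name `group_size` is a hypothetical parameter introduced for illustration (the answer hard-codes 2); everything else comes from the question's frame.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 6, 3, 7, 3, 2, 11, 13, 10, 1, 5],
    'B': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4],
    'C': [1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3],
})

# Hypothetical parameter: how many B groups share one identifier.
group_size = 2

# factorize() gives each distinct value of B an integer code 0, 1, 2, ...
# in order of first appearance.
codes, uniques = pd.factorize(df['B'])

# Integer division maps codes 0,1 -> 0 and 2,3 -> 1, and so on;
# adding 1 makes the identifiers start at 1.
df['new'] = codes // group_size + 1

print(df)
```

Because this is a single vectorized pass, it avoids the per-group `isin` mask of the loop. It also does not require the values in B to be 1..n, and a leftover group at the end (when the number of groups is not a multiple of `group_size`) still gets an identifier. Note that codes follow first appearance, so a value of B that reappears later in the frame keeps the code it was given the first time.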

Upvotes: 1
