Reputation: 2453
Given a frame like this
import pandas as pd
df = pd.DataFrame({'A':[1,2,3,4,6,3,7,3,2,11,13,10,1,5],'B':[1,1,1,2,2,2,2,3,3,3,3,3,4,4],
'C':[1,1,1,1,1,1,1,2,2,2,2,2,3,3]})
I want to allot a unique identifier to multiple groups in column B. For example, going from top for every two groups allot a unique identifier as shown in red boxes in below image. The end result would look like below:
Currently I am doing like below but it seems to be over kill. It's taking too much time to update even 70,000 rows:
b_unique_cnt = df['B'].nunique()
the_list = list(range(1, b_unique_cnt+1))
slice_size = 2
list_of_slices = zip(*(iter(the_list),) * slice_size)
counter = 1
df['D'] = -1
for i in list_of_slices:
df.loc[df['B'].isin(i), 'D'] = counter
counter = counter + 1
df.head(15)
Upvotes: 0
Views: 19
Reputation: 323366
You could do
df['new'] = df.B.factorize()[0]//2+1
#(df.groupby(['B'], sort=False).ngroup()//2).add(1)
df
Out[153]:
A B C new
0 1 1 1 1
1 2 1 1 1
2 3 1 1 1
3 4 2 1 1
4 6 2 1 1
5 3 2 1 1
6 7 2 1 1
7 3 3 2 2
8 2 3 2 2
9 11 3 2 2
10 13 3 2 2
11 10 3 2 2
12 1 4 3 2
13 5 4 3 2
Upvotes: 1