Reputation: 185
I have a dataframe:
cid si
A 1
A 0
A 1
A 0
A 1
A 0
A 0
A 0
A 0
A 0
A 0
A 0
A 0
A 0
B 1
B 0
B 0
B 0
B 0
B 0
B 0
I need another column named ide that repeats the same value until the next 1 in si is encountered, as long as the value in cid stays the same. When the value in cid changes, the identifier starts over. Sample output:
cid si ide
A 1 aa
A 0 aa
A 1 bb
A 0 bb
A 1 cc
A 0 cc
A 0 cc
A 0 cc
A 0 cc
A 0 cc
A 0 cc
A 0 cc
A 0 cc
A 0 cc
B 1 aa
B 0 aa
B 0 aa
B 0 aa
B 0 aa
B 0 aa
B 0 aa
Upvotes: 1
Views: 61
Reputation: 51185
First, define a dictionary that maps 1...n to your desired filler values; here is a small example:
dct = {1: 'aa', 2: 'bb', 3: 'cc'}
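For more than a handful of runs, writing the dictionary by hand gets tedious. Here is a minimal sketch, assuming the labels follow the doubled-letter pattern from the sample output (aa, bb, cc, ...) and that no cid contains more than 26 runs. The helper name make_labels is hypothetical and not part of pandas:

```python
import string

def make_labels(n):
    """Map 1..n to doubled lowercase letters: 1 -> 'aa', 2 -> 'bb', ..."""
    letters = string.ascii_lowercase
    if n > len(letters):
        # Assumption: only single doubled letters are wanted as labels.
        raise ValueError("only 26 doubled-letter labels are available")
    return {i: ch * 2 for i, ch in enumerate(letters[:n], start=1)}

# Same mapping as the hand-written example for n = 3
dct = make_labels(3)
```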
Then use groupby, cumsum and map:
df.groupby('cid').si.cumsum().map(dct)
0 aa
1 aa
2 bb
3 bb
4 cc
5 cc
6 cc
7 cc
8 cc
9 cc
10 cc
11 cc
12 cc
13 cc
14 aa
15 aa
16 aa
17 aa
18 aa
19 aa
20 aa
Name: si, dtype: object
The important part here is cumsum: within each cid, the running total of si goes up by one at every 1, and that is how we know that a new group is starting:
df.groupby('cid').si.cumsum()
0 1
1 1
2 2
3 2
4 3
5 3
6 3
7 3
8 3
9 3
10 3
11 3
12 3
13 3
14 1
15 1
16 1
17 1
18 1
19 1
20 1
Name: si, dtype: int64
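To get the ide column the question asks for, the mapped series can be assigned back to the frame. Here is a minimal end-to-end sketch that rebuilds the sample data from the question; how df is constructed here is an assumption, since the question only shows the printed frame:

```python
import pandas as pd

# Rebuild the sample frame: 14 rows for cid A, 7 rows for cid B
df = pd.DataFrame({
    'cid': ['A'] * 14 + ['B'] * 7,
    'si': [1, 0, 1, 0, 1] + [0] * 9 + [1] + [0] * 6,
})

# Maps the per-cid running count of 1s to a label
dct = {1: 'aa', 2: 'bb', 3: 'cc'}

# The running count restarts for each cid, so labels start over at 'aa'
df['ide'] = df.groupby('cid')['si'].cumsum().map(dct)
```

Note that any running count without a key in dct maps to NaN. That includes rows that come before the first 1 within a cid, where the count is still 0.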
Upvotes: 1