Reputation: 59
I have a DataFrame df, shown below.
df
VAR FLAG
A 1
A 1
A 1
B 1
B 1
B 0
B 0
B 1
B 1
B 1
B 0
C 0
C 0
I want to create a new column named FLAG1 from VAR and FLAG. The expected output is given below.
Final output:
VAR FLAG FLAG1
A 1 1
A 1 1
A 1 1
B 1 1
B 1 1
B 0 2
B 0 3
B 1 4
B 1 4
B 1 4
B 0 5
C 0 1
C 0 2
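For anyone who wants to reproduce this, the frame above could be built roughly as follows. This is only a sketch: the values are copied from the two tables in the question, and the name `expected` is introduced here for illustration and is not part of the original post.

```python
import pandas as pd

# Input data copied from the question: 3 rows of A, 8 rows of B, 2 rows of C.
df = pd.DataFrame({
    "VAR": list("AAABBBBBBBBCC"),
    "FLAG": [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
})

# Desired FLAG1 values, copied from the "Final output" table
# (hypothetical helper name, used only to check candidate solutions).
expected = [1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 5, 1, 2]
```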
Upvotes: 1
Views: 35
Reputation: 153510
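You can use this bit of logic: within each VAR group, start a new block on every row where FLAG differs from the previous row or where FLAG is 0, then take the cumulative sum of those block starts: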
df['FLAG1'] = (df.groupby('VAR')['FLAG']
.transform(lambda x: ((x != x.shift()) | (x == 0)).cumsum()))
Output:
VAR FLAG FLAG1
0 A 1 1
1 A 1 1
2 A 1 1
3 B 1 1
4 B 1 1
5 B 0 2
6 B 0 3
7 B 1 4
8 B 1 4
9 B 1 4
10 B 0 5
11 C 0 1
12 C 0 2
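To see why this works, it may help to break the lambda into steps for a single group. The sketch below uses only the FLAG values of group B from the question; the intermediate names (`changed`, `is_zero`, `starts`) are made up here for illustration.

```python
import pandas as pd

# FLAG values for group B, copied from the question.
b = pd.Series([1, 1, 0, 0, 1, 1, 1, 0], name="FLAG")

# True where the value differs from the previous row.
# The first row compares against NaN, so it is always True.
changed = b != b.shift()

# True on every row where FLAG is 0, so consecutive zeros
# each open their own block instead of sharing one.
is_zero = b == 0

# A row starts a new block if either condition holds.
starts = changed | is_zero

# Counting the block starts so far gives the block number.
flag1 = starts.cumsum()
print(flag1.tolist())
```

Without the `x == 0` term, the two consecutive zeros in group B would fall into the same block, which is not what the expected output shows.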
Upvotes: 3
Reputation: 323356
You could also try something different, using factorize:
(pd.Series(list(zip(df['FLAG'].eq(0).cumsum(), df['FLAG'])))
   .groupby(df['VAR'])
   .transform(lambda x: x.factorize()[0] + 1))
Out[72]:
0 1
1 1
2 1
3 1
4 1
5 2
6 3
7 4
8 4
9 4
10 5
11 1
12 2
dtype: int32
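As a rough illustration of what the factorize approach is doing, the sketch below builds the intermediate (zero count, FLAG) keys explicitly and checks that the result agrees with the cumsum approach from the other answer. The frame is rebuilt from the question's data, and the names `keys`, `via_factorize`, and `via_cumsum` are assumptions made for this example. The result dtype may differ by platform and pandas version, so the comparison casts to int first.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "VAR": list("AAABBBBBBBBCC"),
    "FLAG": [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
})

# Each row gets a key: (number of zeros seen so far, FLAG).
# Every zero bumps the running count, so each zero row gets a unique key,
# while a run of ones between two zeros shares the same key.
keys = pd.Series(
    list(zip(df["FLAG"].eq(0).cumsum(), df["FLAG"])),
    index=df.index,
)

# Within each VAR group, factorize numbers the distinct keys
# in order of first appearance (0-based, hence the + 1).
via_factorize = keys.groupby(df["VAR"]).transform(lambda x: x.factorize()[0] + 1)

# The cumsum-based approach, for comparison.
via_cumsum = df.groupby("VAR")["FLAG"].transform(
    lambda x: ((x != x.shift()) | (x == 0)).cumsum()
)

print(via_factorize.astype(int).tolist() == via_cumsum.astype(int).tolist())
```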
Upvotes: 1