Reputation: 667
I want to create a column C (based on B) that numbers each consecutive run of 100s in B. I have the following pandas DataFrame:
A B
1 0
2 0
3 100
4 100
5 100
6 0
7 0
8 100
9 100
10 100
11 100
12 0
13 0
14 0
15 100
16 100
I want to create the following column C:
A C
1 0
2 0
3 1
4 1
5 1
6 0
7 0
8 2
9 2
10 2
11 2
12 0
13 0
14 0
15 3
16 3
This column C should count each series of 100s.
Thanks in advance.
Upvotes: 1
Views: 68
Reputation: 863301
Use:
df['C'] = (df['B'].shift(-1).eq(100) & df['B'].ne(100)).cumsum() * df['B'].eq(100)
print (df)
A B C
0 1 0 0
1 2 0 0
2 3 100 1
3 4 100 1
4 5 100 1
5 6 0 0
6 7 0 0
7 8 100 2
8 9 100 2
9 10 100 2
10 11 100 2
11 12 0 0
12 13 0 0
13 14 0 0
14 15 100 3
15 16 100 3
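To reproduce the result above, here is a minimal self-contained sketch. The frame is rebuilt from the question's sample data, and the name before_run is only an illustrative helper, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'A': range(1, 17),
    'B': [0, 0, 100, 100, 100, 0, 0, 100, 100, 100, 100, 0, 0, 0, 100, 100],
})

# True on the single row just before each run of 100s
# (the next row is 100 and the current row is not)
before_run = df['B'].shift(-1).eq(100) & df['B'].ne(100)

# Running count of runs seen so far, zeroed on rows where B is not 100
df['C'] = before_run.cumsum() * df['B'].eq(100)

print(df['C'].tolist())
```

Splitting the one-liner into two steps does not change the result; it only makes the "row before the run" marker visible on its own.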
Details and explanation:
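Compare the column shifted by Series.shift by Series.eq for ==, and chain it by & for bitwise AND with Series.ne for !=, which gives True for the one row before each group. Then add Series.cumsum for a counter, and last set the rows outside the groups to 0 by multiplying by the column compared by Series.eq. Each intermediate step is shown as its own column: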
df = df.assign(shifted = df['B'].shift(-1).eq(100),
               chained = df['B'].shift(-1).eq(100) & df['B'].ne(100),
               cumsum = (df['B'].shift(-1).eq(100) & df['B'].ne(100)).cumsum(),
               eq_100 = df['B'].eq(100),
               C = (df['B'].shift(-1).eq(100) & df['B'].ne(100)).cumsum() * df['B'].eq(100))
print (df)
     A    B  shifted  chained  cumsum  eq_100  C
0    1    0    False    False       0   False  0
1    2    0     True     True       1   False  0
2    3  100     True    False       1    True  1
3    4  100     True    False       1    True  1
4    5  100    False    False       1    True  1
5    6    0    False    False       1   False  0
6    7    0     True     True       2   False  0
7    8  100     True    False       2    True  2
8    9  100     True    False       2    True  2
9   10  100     True    False       2    True  2
10  11  100    False    False       2    True  2
11  12    0    False    False       2   False  0
12  13    0    False    False       2   False  0
13  14    0     True     True       3   False  0
14  15  100     True    False       3    True  3
15  16  100    False    False       3    True  3
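One caveat: the approach above marks the row before each group, so if B starts with 100 there is no such row for the first group and it is labelled 0. A possible variant, sketched here as an alternative and not as part of the original answer, marks the first row of each group instead. The function name number_runs and its value parameter are hypothetical:

```python
import pandas as pd

def number_runs(s, value=100):
    """Number each consecutive run of `value` in `s`; other rows get 0."""
    is_val = s.eq(value)
    # A run starts where the current row matches and the previous one does not;
    # fill_value=False treats the row before the first one as a non-match
    starts = is_val & ~is_val.shift(fill_value=False)
    # Count run starts, then zero out rows that are outside any run
    return starts.cumsum() * is_val

# Hypothetical series that begins inside a run
s = pd.Series([100, 100, 0, 100])
print(number_runs(s).tolist())
```

On the question's data this should give the same column C as the one-liner, since every group there has a non-100 row before it.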
Upvotes: 2