user3352632

Reputation: 667

How to count and mark the occurrences of a sequence of a value in a pandas dataframe?

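I want to create a column C (based on B) that numbers each consecutive run of 100s in B: every row in the first run gets 1, every row in the second run gets 2, and so on, while all other rows get 0. I have the following pandas data frame: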

A B 
1 0
2 0
3 100
4 100
5 100
6 0
7 0
8 100
9 100
10 100
11 100
12 0
13 0
14 0
15 100
16 100

I want to create the following column C:

A C
1 0
2 0
3 1
4 1
5 1
6 0
7 0
8 2
9 2
10 2
11 2
12 0
13 0
14 0
15 3
16 3

This column C should count each series of 100s.

Thanks in advance.

Upvotes: 1

Views: 68

Answers (1)

jezrael

Reputation: 863301

Use:

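# True on the row just before each run of 100s; cumsum numbers the runs,
# and multiplying by the B == 100 mask zeroes the rows outside a run
df['C'] = (df['B'].shift(-1).eq(100) & df['B'].ne(100)).cumsum() * df['B'].eq(100)
print(df)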
     A    B  C
0    1    0  0
1    2    0  0
2    3  100  1
3    4  100  1
4    5  100  1
5    6    0  0
6    7    0  0
7    8  100  2
8    9  100  2
9   10  100  2
10  11  100  2
11  12    0  0
12  13    0  0
13  14    0  0
14  15  100  3
15  16  100  3

Details and explanation:

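  1. Shift B up one row with Series.shift(-1) and compare it with 100 using Series.eq (==), so each row shows whether the next row is 100
  2. Chain the condition with & (bitwise AND) and Series.ne (!=), so only the row immediately before each run of 100s is True
  3. Add Series.cumsum to turn those markers into a running counter
  4. Multiply by the column compared with Series.eq(100) to set every row where B is not 100 to 0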

df = df.assign(shifted = df['B'].shift(-1).eq(100),
               chained = df['B'].shift(-1).eq(100) & df['B'].ne(100),
               cumsum = (df['B'].shift(-1).eq(100) & df['B'].ne(100)).cumsum(),
               eq_100 = df['B'].eq(100),
               C = (df['B'].shift(-1).eq(100) & df['B'].ne(100)).cumsum() * df['B'].eq(100))
print (df)
     A    B  shifted  chained  cumsum  eq_100  C
0    1    0    False    False       0   False  0
1    2    0     True     True       1   False  0
2    3  100     True    False       1    True  1
3    4  100     True    False       1    True  1
4    5  100    False    False       1    True  1
5    6    0    False    False       1   False  0
6    7    0     True     True       2   False  0
7    8  100     True    False       2    True  2
8    9  100     True    False       2    True  2
9   10  100     True    False       2    True  2
10  11  100    False    False       2    True  2
11  12    0    False    False       2   False  0
12  13    0    False    False       2   False  0
13  14    0     True     True       3   False  0
14  15  100     True    False       3    True  3
15  16  100    False    False       3    True  3
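
One caveat with the approach above: it marks the row before each run, so a run of 100s that starts on the very first row has no preceding row and is not counted. A possible variant, sketched here under the assumption that such data can occur, marks the first row of each run instead. The helper name number_runs and the edge series are hypothetical and used only for illustration; Series.shift with fill_value is a standard pandas call.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "A": range(1, 17),
    "B": [0, 0, 100, 100, 100, 0, 0, 100, 100, 100, 100, 0, 0, 0, 100, 100],
})

def number_runs(s, value=100):
    """Hypothetical helper: number consecutive runs of `value`, 0 elsewhere."""
    mask = s.eq(value)
    # A run starts where the current row matches and the previous row does not;
    # fill_value=False treats the missing row before the first one as non-matching
    starts = mask & ~mask.shift(fill_value=False)
    # Running count of run starts, kept only on matching rows
    return starts.cumsum().where(mask, 0)

df["C"] = number_runs(df["B"])
print(df)

# Edge case: the column begins with the value being counted
edge = pd.Series([100, 100, 0, 100])
print(number_runs(edge).tolist())  # [1, 1, 0, 2]
```

For the data in the question this gives the same column C as the one-liner; the two only differ when B starts with 100.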

Upvotes: 2
