shri
shri

Reputation: 47

How to count rows between two values of a pandas dataframe?

I have a column that looks like this:

flag
0
1
1
1
1
0
0
1
1
1

I want to count/sum the number of rows between two 0s resulting in the following:

flag Count
0 5
1 5
1 5
1 5
1 5
0 1
0 4
1 4
1 4
1 4

How do I get this? I have tried a solution mentioned in Pandas - Sum values in one column in between values in another This very similar to what I need, but I was not able to modify it to suit my output.

Upvotes: 1

Views: 3104

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35626

Use cumsum to create groups where the flag is 0. Then groupby tranform size to get the group sizes:

import pandas as pd

df = pd.DataFrame({
    'flag': {0: 0, 1: 1, 2: 1, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1, 8: 1, 9: 1}
})

# Create Group IDs based on where 0s are
groups = df.flag.eq(0).cumsum()

# Groupby groups and transform each group to the size
df['Count'] = df.groupby(groups)['flag'].transform('size')

print(df)

df:

   flag  Count
0     0      5
1     1      5
2     1      5
3     1      5
4     1      5
5     0      1
6     0      4
7     1      4
8     1      4
9     1      4

Upvotes: 2

Related Questions