Reputation: 47
I have a column that looks like this:
flag |
---|
0 |
1 |
1 |
1 |
1 |
0 |
0 |
1 |
1 |
1 |
I want to count/sum the number of rows between two 0s resulting in the following:
flag | Count |
---|---|
0 | 5 |
1 | 5 |
1 | 5 |
1 | 5 |
1 | 5 |
0 | 1 |
0 | 4 |
1 | 4 |
1 | 4 |
1 | 4 |
How do I get this? I have tried a solution mentioned in Pandas - Sum values in one column in between values in another This very similar to what I need, but I was not able to modify it to suit my output.
Upvotes: 1
Views: 3104
Reputation: 35626
Use cumsum to create groups where the flag is 0. Then groupby tranform size to get the group sizes:
import pandas as pd
df = pd.DataFrame({
'flag': {0: 0, 1: 1, 2: 1, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1, 8: 1, 9: 1}
})
# Create Group IDs based on where 0s are
groups = df.flag.eq(0).cumsum()
# Groupby groups and transform each group to the size
df['Count'] = df.groupby(groups)['flag'].transform('size')
print(df)
df
:
flag Count
0 0 5
1 1 5
2 1 5
3 1 5
4 1 5
5 0 1
6 0 4
7 1 4
8 1 4
9 1 4
Upvotes: 2