junyu

Reputation: 89

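Counting should start at 0 on a row where the condition is True and increase by 1 on each following row, until the next True row restarts it at 0. Rows before the first True have no count.

Expected result table: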

    bool    count
0   FALSE   
1   FALSE   
2   TRUE    0
3   FALSE   1
4   FALSE   2
5   FALSE   3
6   TRUE    0
7   FALSE   1
8   TRUE    0
9   TRUE    0

How can I calculate the values of the 'count' column?

Upvotes: 2

Views: 494

Answers (6)

mozway

Reputation: 261675

Use GroupBy.cumcount, grouping on the cumulative sum of the boolean column, and mask the rows before the first True with where:

g = df['bool'].cumsum()
df['count'] = df['bool'].groupby(g).cumcount().where(g.gt(0))

Alternative:

g = df['bool'].cumsum()
df['count'] = (df['bool'].groupby(g).cumcount()
              .where(df['bool'].cummax())
              )

Output:

    bool  count
0  False    NaN
1  False    NaN
2   True    0.0
3  False    1.0
4  False    2.0
5   True    0.0
6   True    0.0
7  False    1.0
8  False    2.0
9  False    3.0
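
For a self-contained check against the question's data, here is a minimal sketch of the first variant. The DataFrame construction is an assumption reconstructed from the expected table in the question; it is not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question's expected table (an assumption).
df = pd.DataFrame(
    {"bool": [False, False, True, False, False, False, True, False, True, True]}
)

# Each True starts a new group; rows before the first True fall in group 0.
g = df["bool"].cumsum()

# Number the rows within each group, then blank out the rows before the first True.
df["count"] = df["bool"].groupby(g).cumcount().where(g.gt(0))

print(df)
```

The column comes out as float because of the NaN rows. If integer output is preferred, casting with `.astype("Int64")` (pandas' nullable integer dtype) should keep the missing values as `<NA>`.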

Upvotes: 1

user16836078

Reputation:

If you want to obtain the count without pandas-specific methods, you can use a plain loop:

import numpy as np

result = []
count = np.nan  # stays NaN until the first True, because NaN + 1 is still NaN
for flag in df['bool']:
    if flag:
        count = 0   # a True row restarts the count
    else:
        count += 1  # a False row extends the current run
    result.append(count)

result
Out[4]: [nan, nan, 0, 1, 2, 3, 0, 1, 0, 0]

df['count'] = result

Upvotes: 1

Stryder

Reputation: 880

Here you go:

import numpy as np
import pandas as pd

# create bool dataframe
df = pd.DataFrame(dict(bool_= [0, 0, 1, 0, 0, 1, 1, 0, 0, 0]), dtype= bool)
df.index = list("abcdefghij")

# label each row with the position of the most recent True row;
# rows before the first True stay NaN
ix = pd.Series(range(df.shape[0])).where(df.bool_.values, np.nan).ffill().values

# if the first rows are False, they will be NaNs and shouldn't be 
# counted so only perform groupby and cumcount() for what is notna
notna = pd.notna(ix)
df["count"] = df[notna].groupby(ix[notna]).cumcount()

>>> df   
   bool_  count
a  False    NaN
b  False    NaN
c   True    0.0
d  False    1.0
e  False    2.0
f   True    0.0
g   True    0.0
h  False    1.0
i  False    2.0
j  False    3.0

Upvotes: 2

jakelime

Reputation: 124

I think your question is not clear. We need a little more context and objectives to work with here.

Let's assume that you have a dataframe of Boolean values (True or False), and you wish to count how many are True and how many are False:

import pandas as pd
import random

## Randomly generating Boolean values to populate a dataframe
choices = [True, False]  # actual booleans rather than the strings 'True'/'False'
df = pd.DataFrame(index = range(10), columns = ['boolean'])
df['boolean'] = df['boolean'].apply(lambda x: random.choice(choices))

Randomly generated data

  boolean
0   False
1   False
2   False
3    True
4   False
5   False
6   False
7    True
8   False
9   False

## Reporting the count of True and False values
results = df.groupby('boolean').size()
print(results)

Results

boolean
False    8
True     2

Upvotes: 1

Ynjxsjmh

Reputation: 30050

You can group by the cumulative sum of the bool column, then use transform with a custom function that checks whether the first element in each group is True:

df['m'] = df['bool'].cumsum()
df['out'] = (df.groupby(df['bool'].cumsum())
             ['bool'].transform(lambda col: range(len(col)) if col.iloc[0] else [pd.NA]*len(col)))
print(df)

    bool  count  m   out
0  False    NaN  0  <NA>
1  False    NaN  0  <NA>
2   True    0.0  1     0
3  False    1.0  1     1
4  False    2.0  1     2
5  False    3.0  1     3
6   True    0.0  2     0
7  False    1.0  2     1
8   True    0.0  3     0
9   True    0.0  4     0

Upvotes: 1

pythonheadache

Reputation: 106

If you mean the sum of all the elements in the 'count' column, you can do it this way:

Count_Total = df['count'].sum()

Upvotes: 0
