Slartibartfast
Slartibartfast

Reputation: 1190

Counting True or False

I have the following dataframe:

    True_False
2018-01-02  True
2018-01-03  True
2018-01-04  False
2018-01-05  False
2018-01-08  False
... ...
2020-01-20  True
2020-01-21  True
2020-01-22  True
2020-01-23  True
2020-01-24  False
504 rows × 1 columns

I want to know how many successive True or False but not total it must stop counting after it toggles True or False. As such i want to eventually calculate mean(), max() and min() days. is it possible to show this data in Pandas?

Upvotes: 1

Views: 74

Answers (1)

jezrael
jezrael

Reputation: 862591

Solution if all datetimes are consecutive:

You can create helper Series for consecutive groups by Series.shift and Series.cumsum, then get counts by GroupBy.size:

g = df['True_False'].ne(df['True_False'].shift()).cumsum()
s = df.groupby(['True_False',g]).size()
print (s)
True_False  True_False
False       2             3
            4             1
True        1             2
            3             4
dtype: int64

And last aggregate min, max and mean per first level of MultiIndex:

print (s.groupby(level=0).agg(['mean','max','min']))
            mean  max  min
True_False                
False          2    3    1
True           3    4    2

If datetimes are not consecutive first step is DataFrame.asfreq:

df = df.asfreq('d')
g = df['True_False'].ne(df['True_False'].shift()).cumsum()
s = df.groupby(['True_False',g]).size()
print (s.groupby(level=0).agg(['mean','max','min']))
                mean  max  min
True_False                    
False       1.333333    2    1
True        3.000000    4    2

Upvotes: 1

Related Questions