Reputation: 61
I Have e DataFrame like this:
RTD Val
BA 2
BA 88
BA 15
BA 67
BA 83
BA 77
BA 79
BA 90
BA 1
BA 14
First:
df['count'] = df.Val > 15
print(df)
I get as a result:
RTD Val count
0 BA 2 False
1 BA 88 True
2 BA 15 False
3 BA 67 True
4 BA 83 True
5 BA 77 True
6 BA 79 True
7 BA 90 True
8 BA 1 False
9 BA 14 False
Now, to count the maximum consecutive occurrences I use:
def rolling_count(val):
if val == rolling_count.previous:
rolling_count.count +=1
else:
rolling_count.previous = val
rolling_count.count = 1
return rolling_count.count
rolling_count.count = 0 #static variable
rolling_count.previous = None #static variable
ddf= df['count'].apply(rolling_count)
print (max(ddf))
I get as result: 5
.
My answer is: To count the max occurrences consecutive of False, how i should do?
The correct value is equal to 2
.
I am interested to know the maximum of consecutive occurrences other than True, for Val > 15
and conversely
Upvotes: 2
Views: 1995
Reputation: 38500
Here is a longer method that coerces count to be an integer rather then boolean by adding 0. The absolute difference indicates changes in the boolean value, and the first value is filled to be 1.
The result of this change Series is evaluated as to whether elements are greater than 0 in the 'bools' variable and the corresponding elements from df['count'] are extracted.
The results of the change vector are used with cumsum
to form IDs which are used in groupby
in the runs variable. Counts of each ID are then contstructed in the runs variable.
countDf = DataFrame({'bools': list(df['count'][(df['count'] + 0)
.diff().abs().fillna(1) > 0]),
'runs': list(df['Val'].groupby((df['count'] + 0)
.diff().abs().fillna(1).cumsum()).count())})
countDf
bools runs
0 False 1
1 True 1
2 False 1
3 True 5
4 False 2
You can extract the maximum runs using standard subsetting like
countDf[countDf.bools == False]['runs'].max()
2
countDf[countDf.bools == True]['runs'].max()
5
Upvotes: 2
Reputation: 294218
This is my attempt
gt15 = df.Val.gt(15)
counts = df.groupby([gt15, (gt15 != gt15.shift()) \
.cumsum()]).size().rename_axis(['>15', 'grp'])
counts
>15 grp
False 1 1
3 1
5 2
True 2 1
4 5
dtype: int64
counts.loc[False].max()
2
Upvotes: 1