Reputation: 286
I am trying to count how many times a condition holds for several days in a row.
I read about groupby.cumcount(), but it doesn't work the way I would like.
Here is a small part of the data:
min max
time
1970-01-02 -3.440000 -1.180000
1970-01-03 -4.830000 -0.700000
1970-01-04 -6.250000 0.250000
1970-01-05 -11.700000 -6.690000
1970-01-06 -13.000000 -3.720000
1970-01-07 -3.870000 2.070000
1970-01-08 0.320000 2.690000
1970-01-09 -5.170000 2.310000
1970-01-10 -4.540000 1.140000
1970-01-11 -7.260000 1.300000
1970-01-12 -9.870000 -0.780000
1970-01-13 -6.520000 -0.390000
1970-01-14 -8.490000 -5.090000
1970-01-15 -13.670000 -8.670000
1970-01-16 -11.080000 -4.110000
1970-01-17 -24.770000 -7.320000
1970-01-18 -29.709999 -24.230000
1970-01-19 -24.200001 -19.480000
1970-01-20 -31.000000 -13.810000
1970-01-21 -36.389999 -30.209999
1970-01-22 -39.889999 -36.990002
1970-01-23 -41.750000 -38.730000
1970-01-24 -38.259998 -8.510000
1970-01-25 -14.100000 -5.740000
1970-01-26 -12.000000 -8.540000
1970-01-27 -12.060000 -7.470000
1970-01-28 -10.230000 -7.710000
1970-01-29 -10.850000 -8.400000
1970-01-30 -15.270000 -9.870000
1970-01-31 -11.920000 -5.290000
Consider the condition df['min'] <= -30 and a period of at least 3 days.
I would like to know how many times per year there are at least three consecutive days with 'min' at or below -30.
The result would look something like this (dummy values):
occurences
time
1970 3
1971 4
1972 2
1973 3
I have toyed with a few solutions but can't get close. Any suggestions?
Upvotes: 1
Views: 1686
Reputation: 210982
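IIUC, you can take a 3-day rolling maximum of 'min': a window's maximum is <= -30 only if all three days in it are <= -30.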
In [94]: x = df[df.rolling(3)['min'].max() <= -30]
In [95]: x.groupby(x.index.year)['min'].count().to_frame('occurences')
Out[95]:
occurences
1970 3
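Note that the rolling approach counts every qualifying 3-day window, so the single 5-day spell in the sample (1970-01-20 to 1970-01-24) contributes 3. If what you want instead is the number of distinct spells of at least three days, here is a minimal sketch. It assumes each spell is attributed to the year in which it starts; `count_cold_spells` is a hypothetical helper name, and the sample rows are copied from the question.

```python
import pandas as pd

# Ten rows copied from the question (1970-01-18 .. 1970-01-27), 'min' column only
idx = pd.date_range("1970-01-18", periods=10, freq="D")
df = pd.DataFrame(
    {"min": [-29.709999, -24.200001, -31.0, -36.389999, -39.889999,
             -41.75, -38.259998, -14.1, -12.0, -12.06]},
    index=idx,
)

def count_cold_spells(df, threshold=-30, min_days=3):
    """Count distinct runs of >= min_days consecutive days with min <= threshold, per year.

    Assumption: a run is counted in the year of its first day.
    """
    cold = df["min"] <= threshold
    # A new run id starts every time the condition flips between True and False
    run_id = (cold != cold.shift()).cumsum()
    # Number of cold days in each run, broadcast back to every row of that run
    run_len = cold.groupby(run_id).transform("sum")
    # First day of each cold run: cold today, not cold the day before
    is_start = cold & ~cold.shift(fill_value=False)
    starts = df.index[(is_start & (run_len >= min_days)).to_numpy()]
    years = pd.Series(starts.year)
    return years.groupby(years).size().rename("occurences")

# Rolling-window count (as in the answer above) vs. distinct-spell count
windows = int((df["min"].rolling(3).max() <= -30).sum())
spells = count_cold_spells(df)
print(windows)  # 3 windows inside the one 5-day spell
print(spells)   # one spell, starting in 1970
```

Which of the two numbers is the right one depends on whether overlapping windows inside one long spell should count separately.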
Upvotes: 4