Reputation: 362
Is there a rolling "any" function for a pandas.DataFrame? Or is there another way to aggregate boolean values in a rolling window?
Consider:
import pandas as pd
import numpy as np
s = pd.Series([True, True, False, True, False, False, False, True])
# this works but I don't think it is clear enough - I am not
# interested in the sum but a logical or!
s.rolling(2).sum() > 0
# What I would like to have:
s.rolling(2).any()
# AttributeError: 'Rolling' object has no attribute 'any'
s.rolling(2).agg(np.any)
# Same error! AttributeError: 'Rolling' object has no attribute 'any'
So which functions can I use when aggregating booleans? (if numpy.any does not work) The rolling documentation at https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.rolling.html states that "a Window or Rolling sub-classed for the particular operation" is returned, which doesn't really help.
Upvotes: 6
Views: 3949
Reputation: 765
def function1(ss: pd.Series):
    # writes the window's result back into s itself, so s is modified in place
    s[ss.index.max()] = any(ss)
    return 0
s.rolling(2).apply(function1).pipe(lambda ss: s)
0 True
1 True
2 True
3 True
4 True
5 True
6 False
7 True
Upvotes: 0
Reputation: 2071
You aggregate boolean values like this:
# logical or
s.rolling(2).max().astype(bool)
# logical and
s.rolling(2).min().astype(bool)
To deal with the NaN values from incomplete windows, you can use an appropriate fillna before the type conversion, or the min_periods argument of rolling. Which one to use depends on the logic you want to implement.
It is a pity this cannot be done in pandas without creating intermediate values as floats.
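The two NaN-handling options mentioned above can be sketched as follows. This is only a minimal illustration that reuses the series from the question; the variable names are made up for the example. Note that a NaN left in place becomes True under astype(bool), which is why the incomplete first window needs explicit treatment.

```python
import pandas as pd

s = pd.Series([True, True, False, True, False, False, False, True])

# No NaN handling: the incomplete first window is NaN, and NaN casts to True.
naive = s.rolling(2).max().astype(bool)

# Option 1: treat incomplete windows as False before the type conversion.
any_filled = s.rolling(2).max().fillna(0).astype(bool)

# Option 2: evaluate incomplete windows on the values they do contain.
any_partial = s.rolling(2, min_periods=1).max().astype(bool)

print(pd.DataFrame({"naive": naive, "filled": any_filled, "partial": any_partial}))
```

With fillna(0) the first row is False; with min_periods=1 the first row reflects only the first value of the series.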
Upvotes: 8
Reputation: 863291
This method is not implemented. The closest alternative is Rolling.apply (the result is stored under a new name so that s itself stays unchanged):
res = s.rolling(2).apply(lambda x: x.any(), raw=False)
print(res)
0 NaN
1 1.0
2 1.0
3 1.0
4 1.0
5 0.0
6 0.0
7 1.0
dtype: float64
res_bool = s.rolling(2).apply(lambda x: x.any(), raw=False).fillna(0).astype(bool)
print(res_bool)
0 False
1 True
2 True
3 True
4 True
5 False
6 False
7 True
dtype: bool
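As a variation on the apply approach above, raw=True passes each window to the function as a NumPy array instead of a Series. The sketch below assumes the same input series; the explicit float(...) conversion is a precaution added here, not something the original answer uses.

```python
import pandas as pd

s = pd.Series([True, True, False, True, False, False, False, True])

# raw=True: each window arrives as a float64 ndarray rather than a Series.
# The function returns a float so the result fits the float64 output.
res_raw = s.rolling(2).apply(lambda x: float(x.any()), raw=True)

# Incomplete windows are NaN; treat them as False, then convert to bool.
res_raw_bool = res_raw.fillna(0).astype(bool)
print(res_raw_bool)
```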
A better approach here is to use strides: generate a 2D NumPy array of windows and process it afterwards:
s = pd.Series([True, True, False, True, False, False, False, True])
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
a = rolling_window(s.to_numpy(), 2)
print (a)
[[ True True]
[ True False]
[False True]
[ True False]
[False False]
[False False]
[False True]]
print (np.any(a, axis=1))
[ True True True True False False True]
Here the leading NaN values that pandas would produce are omitted. You can prepend values before processing, here False values:
n = 2
x = np.concatenate([[False] * (n), s])
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
a = rolling_window(x, n)
print (a)
[[False False]
[False True]
[ True True]
[ True False]
[False True]
[ True False]
[False False]
[False False]
[False True]]
print (np.any(a, axis=1))
[False True True True True True False False True]
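If the result should line up with the original index, one option is to prepend n - 1 values rather than n, so that there is exactly one window per row. The sketch below makes that assumption and also swaps the hand-written as_strided helper for numpy.lib.stride_tricks.sliding_window_view, which requires NumPy 1.20 or later; the variable names are made up for the example.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

s = pd.Series([True, True, False, True, False, False, False, True])
n = 2

# Prepend n - 1 False values so there is exactly one window per original row.
padded = np.concatenate([np.zeros(n - 1, dtype=bool), s.to_numpy()])
windows = sliding_window_view(padded, n)  # shape: (len(s), n)

# Reduce each window with a logical or and reattach the original index.
rolling_any = pd.Series(windows.any(axis=1), index=s.index)
print(rolling_any)
```

Padding with False means the first rows are decided only by the values actually present, which matches the behaviour of min_periods=1 for a logical or.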
Upvotes: 3