Reputation: 3392
I want to apply a rolling window function to the y_train DataFrame. y_train is a single column:
0
0
1
..
2
0
3
0
Unique values in y_train:
np.unique(y_train.values)
> array([0, 1, 2, 3])
When I apply this code, I get float values in y_train:
window = 20
y_train = y_train.rolling(window).median().dropna()
New unique values in y_train:
np.unique(y_train.values)
> array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. ])
How can I apply a rolling window function that returns the most frequent value (the mode) in each window instead of the median?
Upvotes: 2
Views: 370
Reputation: 221624
We could use scipy.stats.mode along with apply():
In [57]: a
Out[57]:
0 2
1 3
2 2
3 2
4 7
5 3
6 2
7 4
8 6
9 3
dtype: int64
In [58]: from scipy import stats
In [59]: modeval = lambda x : stats.mode(x)[0]
In [60]: a.rolling(window=5).apply(modeval).dropna()
Out[60]:
4 2.0
5 2.0
6 2.0
7 2.0
8 2.0
9 3.0
dtype: float64
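Note that rolling().apply() hands back float64, so the labels are no longer integers. If you want to keep the original dtype and avoid the SciPy dependency, here is a minimal sketch. It assumes integer labels and that ties should go to the smallest value, which is how the output above behaves. rolling_mode and window_mode are just illustrative helper names, not part of pandas.

```python
import numpy as np
import pandas as pd


def rolling_mode(series, window):
    """Most frequent value per window; ties resolve to the smallest value."""

    def window_mode(values):
        # With raw=True each window arrives as a plain float ndarray.
        # np.unique returns sorted values, so argmax picks the smallest
        # value among those tied for the highest count.
        uniques, counts = np.unique(values, return_counts=True)
        return uniques[np.argmax(counts)]

    result = series.rolling(window).apply(window_mode, raw=True).dropna()
    # Cast back to the input dtype (assumes the input had no NaNs).
    return result.astype(series.dtype)


a = pd.Series([2, 3, 2, 2, 7, 3, 2, 4, 6, 3])
out = rolling_mode(a, 5)
print(out.tolist())  # [2, 2, 2, 2, 2, 3]
```

For the question's data this would be called as rolling_mode(y_train[col], 20) on the single column, where col stands for whatever that column is named.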
Upvotes: 1