Reputation: 5559
I create a pandas DataFrame as follows:
import pandas as pd
df = pd.DataFrame(data=[[1],[2],[3],[1],[2],[3],[1],[2],[3]])
df
Out[19]:
0
0 1
1 2
2 3
3 1
4 2
5 3
6 1
7 2
8 3
I calculate the 75th percentile over rolling windows of length 3:
df.rolling(window=3,center=False).quantile(0.75)
Out[20]:
0
0 NaN
1 NaN
2 2.0
3 2.0
4 2.0
5 2.0
6 2.0
7 2.0
8 2.0
Then, just to check, I calculate the 75th percentile of the first window separately:
df.iloc[0:3].quantile(0.75)
Out[22]:
0 2.5
Name: 0.75, dtype: float64
Why do I get a different value?
Upvotes: 2
Views: 8401
Reputation: 402493
This is a bug, referenced in GH9413 and GH16211.
The reason, as given by the devs:
It looks like the difference here is that quantile and percentile take the weighted average of the nearest points, whereas rolling_quantile simply uses one of the nearest points (no averaging).
Rolling.quantile did not interpolate when computing the quantiles.
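To make the difference concrete, here is a minimal sketch of the two rules applied to the first window. The helper names quantile_linear and quantile_lower are hypothetical and are not pandas functions. quantile_lower is only a stand-in for "use one of the nearest points"; the exact rule used by the old rolling implementation is not stated here.

```python
import math

def quantile_linear(values, q):
    """Quantile with linear interpolation between the two nearest points."""
    s = sorted(values)
    pos = (len(s) - 1) * q          # fractional position in the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return s[lo] + frac * (s[hi] - s[lo])

def quantile_lower(values, q):
    """Quantile that takes the lower neighbouring point, with no averaging."""
    s = sorted(values)
    return float(s[math.floor((len(s) - 1) * q)])

window = [1, 2, 3]
print(quantile_linear(window, 0.75))  # 2.5
print(quantile_lower(window, 0.75))   # 2.0
```

For the window [1, 2, 3] and q = 0.75, the position is (3 - 1) * 0.75 = 1.5, halfway between the values 2 and 3. Interpolating gives 2.5, while taking the lower point gives 2.0.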
The bug has been fixed as of 0.21.
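On a fixed version, a quick check along these lines should show the rolling result agreeing with the direct quantile. This sketch assumes a reasonably recent pandas; the interpolation keyword on Rolling.quantile was, as far as I know, added later than the fix itself (0.23), so drop that line on versions without it.

```python
import pandas as pd

df = pd.DataFrame(data=[[1], [2], [3], [1], [2], [3], [1], [2], [3]])

# Default interpolation is linear, the same default DataFrame.quantile uses.
rolled = df.rolling(window=3, center=False).quantile(0.75)

# Assumption: pandas >= 0.23. "lower" takes the lower neighbouring point,
# which gives 2.0 here, the value seen in the question.
rolled_lower = df.rolling(window=3).quantile(0.75, interpolation="lower")

# Quantile of the first window computed directly, for comparison.
direct = df.iloc[0:3].quantile(0.75)

print(rolled.iloc[2, 0], rolled_lower.iloc[2, 0], direct.iloc[0])
```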
For older versions, the fix is to use a rolling apply:
df.rolling(window=3, center=False).apply(lambda x: pd.Series(x).quantile(0.75))
0
0 NaN
1 NaN
2 2.5
3 2.5
4 2.5
5 2.5
6 2.5
7 2.5
8 2.5
Upvotes: 7