Possible bug in pandas rolling mean when window = 1

Question

To keep the notation in my code generic, I want to express my original time series as a moving average over 1 period. Unexpectedly, when I use the pandas pd.rolling_mean function, the two are not exactly the same:

import pandas as pd
import numpy as np

np.random.seed(1)

ts = pd.Series(np.random.rand(1000))

mavg = pd.rolling_mean(ts, 1)

(ts - mavg).describe()
Out[120]: 
count    1.000000e+03
mean     6.284973e-16
std      3.877250e-16
min     -3.330669e-16
25%      3.330669e-16
50%      5.551115e-16
75%      8.881784e-16
max      1.554312e-15
dtype: float64

any((ts - mavg).dropna()>0)
Out[121]: True

Should this be considered a bug or am I missing something?

Mike Müller · Accepted Answer

The differences are very small and well within the range of numerical "noise" caused by how floats work. Floats cannot represent all numbers exactly, so calculations with floats often leave small "residuals" behind. Instead of testing for exact equality, check against a small epsilon:

>>> any((ts - mavg).dropna().abs() > 1e-14)
False
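To illustrate where such residuals can come from, here is a minimal sketch in plain Python. The function `running_mean_window_1` is hypothetical and written only for this illustration: it assumes the mean is maintained with a running sum that adds each incoming value and subtracts the outgoing one. It is not taken from the pandas source, and the tolerance `1e-12` is likewise an arbitrary choice.

```python
import math
import random

# Classic illustration: 0.1, 0.2 and 0.3 have no exact binary representation.
print(0.1 + 0.2 == 0.3)              # exact comparison
print(math.isclose(0.1 + 0.2, 0.3))  # tolerance-based comparison


def running_mean_window_1(values):
    """Hypothetical window-1 mean kept as a running sum (illustration only)."""
    result = []
    running_sum = 0.0
    previous = None
    for value in values:
        running_sum += value         # add the value entering the window
        if previous is not None:
            running_sum -= previous  # subtract the value leaving the window
        result.append(running_sum / 1)
        previous = value
    return result


random.seed(1)
data = [random.random() for _ in range(1000)]
smoothed = running_mean_window_1(data)

# Rounding in each add/subtract step can leave tiny differences behind.
max_residual = max(abs(a - b) for a, b in zip(data, smoothed))

# Compare with a tolerance instead of testing for exact equality.
all_close = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(data, smoothed))
print(max_residual, all_close)
```

With the Series from the question, the same kind of check can be written as `np.allclose(ts, mavg)`. In recent pandas versions, where `pd.rolling_mean` is no longer available, the equivalent call is `ts.rolling(1).mean()`.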
