Reputation: 1430
I'm trying to get the rolling mean of a column in a Pandas dataframe, and I can't force the mean to skip the NAs. I have the same issue with the rolling sum:
import pandas as pd
test = pd.DataFrame({'A': [1, 2, 3, 10, np.nan, 5 , 20, 6, 7, np.nan, np.nan, np.nan]})
test['rolling_mean'] = test.A.rolling(3, min_periods=3).mean(skipna = True).reset_index(0,drop=True)
test['rolling_sum'] = test.A.rolling(3, min_periods=3).sum(skipna = True).reset_index(0,drop=True)
test
A rolling_mean rolling_sum rolling_mean_desired rolling_sum_desired
0 1 NaN NaN NaN NaN
1 2 NaN NaN NaN NaN
2 3 2 6 2 6
3 10 5 15 5 15
4 NaN NaN NaN 6.5 13
5 5 NaN NaN 7.5 15
6 20 NaN NaN 12.5 25
7 6 10.333333 31 10.33333333 31
8 7 11 33 11 33
9 Nan NaN NaN 6.5 13
10 Nan NaN NaN 7 7
11 Nan NaN NaN NaN NaN
In the example above, I do want the first two entries to be NaN (that's why I defined the min_period the same as my window. But rows 4,5,6,9,10 shouldn't be NAs. Row 11 should be NaN as all three rows are NaNs.
Upvotes: 0
Views: 472
Reputation:
As the documentation in the Pandas website said, the min_periods
is the minimum number of observations in window required to have a value. So what you need to do is just to make the min_periods as 1, not 3.
test.A.rolling(3,min_periods=1).mean()
Out[120]:
0 1.000000
1 1.500000
2 2.000000
3 5.000000
4 6.500000
5 7.500000
6 12.500000
7 10.333333
8 11.000000
9 6.500000
10 7.000000
11 NaN
Name: A, dtype: float64
test.A.rolling(3,min_periods=1).sum()
Out[122]:
0 1.0
1 3.0
2 6.0
3 15.0
4 13.0
5 15.0
6 25.0
7 31.0
8 33.0
9 13.0
10 7.0
11 NaN
Name: A, dtype: float64
And if you want to change the first two row to become NaN, you can manually do it after performing the rolling operation.
test.loc[0:1, ['rolling_mean','rolling_sum']] = np.nan
test
Out[142]:
A rolling_mean rolling_sum
0 1.0 NaN NaN
1 2.0 NaN NaN
2 3.0 2.000000 6.0
3 10.0 5.000000 15.0
4 NaN 6.500000 13.0
5 5.0 7.500000 15.0
6 20.0 12.500000 25.0
7 6.0 10.333333 31.0
8 7.0 11.000000 33.0
9 NaN 6.500000 13.0
10 NaN 7.000000 7.0
11 NaN NaN NaN
Upvotes: 2