Reputation: 159
When I try to compute the rolling median of the following series, I get a list of NaNs.
I used:
b = a[a!=0].rolling(100).median()
a = the actual data (a DataFrame). It contains many zeros that I want to exclude when computing the median.
b = rolling median
a[a!=0] gives me the following series:
2017-10-05 NaN
2017-10-06 -0.001074
2017-10-09 -0.001804
2017-10-10 NaN
2017-10-11 NaN
2017-10-12 -0.001687
2017-10-13 NaN
2017-10-16 NaN
2017-10-17 NaN
2017-10-18 NaN
2017-10-19 NaN
2017-10-20 NaN
2017-10-23 -0.003972
2017-10-24 NaN
2017-10-25 -0.004663
2017-10-26 NaN
2017-10-27 NaN
2017-10-30 -0.003192
2017-10-31 NaN
2017-11-01 NaN
2017-11-02 NaN
2017-11-03 NaN
2017-11-06 NaN
2017-11-07 -0.000189
2017-11-08 NaN
2017-11-09 -0.003762
2017-11-10 -0.000898
2017-11-13 NaN
2017-11-14 -0.002310
The output is just a list of NaNs.
What am I doing wrong? Thank you!
Upvotes: 1
Views: 461
Reputation: 38415
Since a is a DataFrame and not a Series, boolean indexing with a[a != 0] keeps every row and replaces the values that fail the condition with NaN.
Consider this Series:
import numpy as np
import pandas as pd
s = pd.Series(np.random.randint(0, 10, 20), index=pd.date_range(start='01/01/2017', periods=20))
If you filter it with a boolean mask, the zeros are dropped:
s[s != 0]
But for a DataFrame, the same kind of mask keeps the original shape and puts NaN where the zeros were:
df = pd.DataFrame(np.random.randint(0, 10, 20), index=pd.date_range(start='01/01/2017', periods=20))
df[df != 0]
You can drop those rows instead by selecting the column when building the mask:
df[df[0] != 0]  # df[0] being the column
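To make the difference easier to see, here is a small sketch with fixed values instead of random ones. The column name "x" and the sample values are made up for illustration; they are not from the question's data.

```python
import pandas as pd

# Hypothetical sample data: three zeros, three non-zero values.
idx = pd.date_range(start="2017-01-01", periods=6)
values = [0, 3, 0, 5, 2, 0]

s = pd.Series(values, index=idx)
df = pd.DataFrame({"x": values}, index=idx)

s_nonzero = s[s != 0]        # Series + boolean Series: zero rows are dropped
df_masked = df[df != 0]      # DataFrame + boolean DataFrame: shape kept, zeros become NaN
df_rows = df[df["x"] != 0]   # DataFrame + boolean Series: zero rows are dropped

print(len(s_nonzero), df_masked.shape, len(df_rows))
```

Only the middle form leaves NaNs behind, which matches the output shown in the question.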
Upvotes: 3
Reputation: 363233
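This is most likely not a bug in pandas: for an integer window, min_periods defaults to the window size, so any window with fewer than 100 non-NaN values returns NaN.
Try this: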
a[a!=0].rolling(window=100, center=False, min_periods=1).median()
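A likely reason min_periods=1 helps is that, for an integer window, rolling requires as many non-NaN observations as the window size by default. The sketch below uses a short made-up series and a window of 4 (both assumptions for illustration) to compare the default, min_periods=1, and dropping the NaNs first.

```python
import numpy as np
import pandas as pd

# Hypothetical series with gaps, similar in spirit to a[a != 0] in the question.
idx = pd.date_range(start="2017-01-01", periods=8)
s = pd.Series([np.nan, 1.0, 2.0, np.nan, 3.0, np.nan, 4.0, 5.0], index=idx)

# Default: min_periods equals the window size, so a window needs 4 non-NaN values.
strict = s.rolling(window=4).median()

# min_periods=1: a window yields a median as soon as it holds one non-NaN value.
relaxed = s.rolling(window=4, min_periods=1).median()

# Alternative: drop the NaNs first, so each window spans 4 actual observations
# rather than 4 consecutive rows.
dropped = s.dropna().rolling(window=4).median()

print(strict.isna().all())
print(relaxed)
print(dropped)
```

Which variant is appropriate depends on whether the window should cover 100 rows of the original index or 100 non-zero observations.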
Upvotes: 2