helloworldlevel

Reputation: 159

Median of a rolling window, excluding zeros

When I try to compute the rolling median of the following series, I get nothing but NaNs.

I used:

b = a[a!=0].rolling(100).median()

a = the actual data (a DataFrame). It contains a lot of zeros that I want to exclude when computing the median.

b = rolling median

a[a!=0] gives me the following series.

2017-10-05          NaN
2017-10-06    -0.001074
2017-10-09    -0.001804
2017-10-10          NaN
2017-10-11          NaN
2017-10-12    -0.001687
2017-10-13          NaN
2017-10-16          NaN
2017-10-17          NaN
2017-10-18          NaN
2017-10-19          NaN
2017-10-20          NaN
2017-10-23    -0.003972
2017-10-24          NaN
2017-10-25    -0.004663
2017-10-26          NaN
2017-10-27          NaN
2017-10-30    -0.003192
2017-10-31          NaN
2017-11-01          NaN
2017-11-02          NaN
2017-11-03          NaN
2017-11-06          NaN
2017-11-07    -0.000189
2017-11-08          NaN
2017-11-09    -0.003762
2017-11-10    -0.000898
2017-11-13          NaN
2017-11-14    -0.002310

The output is just a list of NaNs.

What am I doing wrong? Thank you!

Upvotes: 1

Views: 461

Answers (2)

Vaishali

Reputation: 38415

Since a is a DataFrame and not a Series, boolean indexing with a != 0 does not drop the zero rows. It keeps the original shape and replaces the zeros with NaN.

Consider this Series

s = pd.Series(np.random.randint(0,10, 20), index = pd.date_range(start = '01/01/2017', periods = 20))

If you filter it with a boolean mask, the zeros are dropped:

s[s!=0]

For a DataFrame, however, the same expression (df[df != 0]) keeps every row and puts NaN wherever the value was zero.

df = pd.DataFrame(np.random.randint(0,10, 20), index = pd.date_range(start = '01/01/2017', periods = 20))

You can handle this by selecting the column when building the mask, so that the mask is a Series and filters rows:

df[df[0] != 0] #df[0] being the column
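To make the difference concrete, here is a minimal sketch that compares the three forms side by side. The values and the names s_filtered, df_masked and df_filtered are made up for illustration; the column label 0 matches the example above.

```python
import pandas as pd

idx = pd.date_range(start="2017-01-01", periods=6)
values = [0, 3, 0, 5, 7, 0]  # made-up sample data with three zeros

s = pd.Series(values, index=idx)
df = pd.DataFrame({0: values}, index=idx)

# Series: a boolean mask drops the rows where the mask is False.
s_filtered = s[s != 0]

# DataFrame masked by a boolean DataFrame: the shape is kept and the
# non-matching cells become NaN (same result as df.where(df != 0)).
df_masked = df[df != 0]

# Building the mask from a single column gives a boolean Series,
# which filters rows instead of masking cells.
df_filtered = df[df[0] != 0]

print(len(s_filtered), len(df_masked), len(df_filtered))
```

The Series and the column-based filter both end up with only the non-zero rows, while the DataFrame mask keeps all six rows with NaN in place of the zeros.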

Upvotes: 3

wim

Reputation: 363233

Seems like a bug in pandas.

Try this:

a[a!=0].rolling(window=100, center=False, min_periods=1).median()
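A likely reason min_periods=1 helps: for an integer window, pandas defaults min_periods to the window size, so any window that contains a NaN (or is not yet full) returns NaN. With the zeros turned into NaN, almost no window of 100 rows has 100 valid observations. The sketch below shows this on made-up data, using a window of 3 instead of 100 for brevity; the names nonzero, strict and relaxed are only illustrative.

```python
import pandas as pd

# Made-up sample; zeros are the values to be excluded.
s = pd.Series([0.0, 1.0, 0.0, 3.0, 5.0, 0.0, 7.0])
nonzero = s.where(s != 0)  # zeros become NaN, index is unchanged

# Default: min_periods equals the window size, so every window here
# (each one contains at least one NaN) produces NaN.
strict = nonzero.rolling(window=3).median()

# min_periods=1: the median is taken over whatever non-NaN values
# fall inside each window.
relaxed = nonzero.rolling(window=3, min_periods=1).median()

print(strict.tolist())
print(relaxed.tolist())
```

Note that this still counts the zero rows toward the window length. If the window should instead span the last 100 non-zero observations, one option is to drop the zeros from a single column first (a Series) and then call rolling(100).median() on the result.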

Upvotes: 2
