Reputation: 299
I'm writing a moving-average function for a time series:
import datetime
import numpy
import pandas

def datedat_moving_mean(datedat, window):
    # window is the number of points to average over
    datedatF = pandas.DataFrame(datedat)
    return datedatF.rolling(window).mean().values
The code above is copied from "Moving Average - Pandas".
Then I apply this function to this time series:
datedat1 = numpy.array(
    [pandas.date_range(start=datetime.datetime(2015, 1, 30), periods=17),
     numpy.random.rand(17)]).T
However, datedat_moving_mean(datedat1, 4) just returns the original datedat1. Nothing was averaged! What's wrong?
Upvotes: 0
Views: 4280
Reputation: 942
Your DataFrame is constructed without an explicit index (so it defaults to integers) and has one column of Timestamps and one column of floats.
I imagine that you want to use the Timestamps as the index, but even if you don't, you will need to do so in order to use .rolling() on the frame.
I would suggest initialising the original DataFrame more like this:
import datetime
import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.random.rand(17),
                  index=pd.date_range(start=datetime.datetime(2015, 1, 30), periods=17))
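As a rough sketch of how the rolling mean behaves once the dates are the index and the values are plain floats, something like the following should work. The helper name moving_mean and the use of np.arange instead of random data are assumptions made here so that the result is easy to check by hand; they are not part of the original question.

```python
import numpy as np
import pandas as pd

def moving_mean(series, window):
    """Rolling mean over `window` consecutive rows of a numeric Series.

    Hypothetical helper for illustration only.
    """
    return series.rolling(window).mean()

# Dates go into the index, values stay in a float column
index = pd.date_range(start="2015-01-30", periods=17)
values = pd.Series(np.arange(17, dtype=float), index=index)

smoothed = moving_mean(values, 4)

# The first window - 1 entries are NaN because the window is not yet full;
# the fourth entry is the mean of 0, 1, 2, 3
print(smoothed.head(5))
```

The result keeps the DatetimeIndex, so the dates and the averaged values stay aligned without any extra bookkeeping.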
If, however, you would rather keep the DataFrame un-indexed, you can work around the rolling issue by temporarily setting the index to the Timestamp column:
import pandas as pd
import numpy as np
import datetime

def datedat_moving_mean(datedat, window):
    datedatF = pd.DataFrame(datedat)
    # Temporarily set the index to the Timestamp column, compute the rolling
    # mean, and then return the values of the entire DataFrame.
    # astype(float) guards against the value column being stored as object dtype.
    return datedatF.set_index(0).astype(float).rolling(window).mean().reset_index().values

datedat1 = np.array([pd.date_range(start=datetime.datetime(2015, 1, 30), periods=17),
                     np.random.rand(17)]).T
vals = datedat_moving_mean(datedat1, 5)
I would suggest, however, that creating the DataFrame with an index in the first place is the better approach (consider what happens if the datetimes are not sorted and you call rolling on the DataFrame).
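To illustrate the point about unsorted datetimes, here is a small sketch. It assumes an integer window, which rolls over row position rather than date order; the shuffled row order and the variable names are made up for this example.

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start="2015-01-30", periods=6)
ordered = pd.Series(np.arange(6, dtype=float), index=dates)

# Reorder the rows to mimic data that arrives out of date order
shuffled = ordered.iloc[[3, 0, 5, 1, 4, 2]]

# An integer window averages neighbouring rows, whatever their dates are
positional = shuffled.rolling(2).mean()

# Sorting by the date index first restores the intended chronological average
by_date = shuffled.sort_index().rolling(2).mean()

print(positional)
print(by_date)
```

With the dates in the index, a single sort_index() call is enough to make sure the window runs over consecutive dates.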
Upvotes: 1