Reputation:
I have some event times in a list and I would like to plot an exponentially weighted moving average of them. I can do this using the following code.
import numpy as np
import matplotlib.pyplot as plt
print("Code running")
a = 0.01
l = [3.0, 7.0, 10.0, 20.0, 200.0]
y = np.zeros(1000)
for item in l:
    y[int(item)] = 1  # array indices must be integers
s = np.zeros(1000)
x = np.linspace(0, 1000, 1000)
for i in range(1000):
    s[i] = a*y[i-1] + (1-a)*s[i-1]
plt.plot(x, s)
plt.show()
This is clearly a horrible way to use python however. What's the right way to do this? Is it possible to do it without making all these extra sparse arrays?
The output should look like this.
Upvotes: 2
Views: 307
Reputation: 40973
Pandas comes to mind for this task:
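import numpy as np
import pandas as pd
l = [3.0,7.0,10.0,20.0,200.0]
s = pd.Series(np.ones_like(l), index=l)
y = s.reindex(range(1000), fill_value=0)
# Originally pd.ewma(y, 199); newer pandas releases use Series.ewm instead.
# span=199 gives alpha = 2/(199+1) = 0.01; adjust=False uses the plain recursion.
y.ewm(span=199, adjust=False).mean().plot()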
The span 199 is related to your parameter alpha 0.01 by a = 2/(n+1), i.e. n = 2/a - 1 = 199.
Result: [plot not shown]
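As a quick check of the span relation above, here is a minimal sketch of the conversion in both directions. The helper names alpha_from_span and span_from_alpha are made up for illustration and are not part of pandas.

```python
import math

def alpha_from_span(span):
    """Smoothing factor for a given span: alpha = 2 / (span + 1)."""
    return 2.0 / (span + 1.0)

def span_from_alpha(alpha):
    """Inverse of the relation above: span = 2 / alpha - 1."""
    return 2.0 / alpha - 1.0

# The question's a = 0.01 corresponds to a span of 199, and back again.
print(span_from_alpha(0.01))
print(alpha_from_span(199))
```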
Upvotes: 1
Reputation: 1298
I think you're looking for something like this:
import numpy as np
import matplotlib.pyplot as plt
from scikits.timeseries.lib.moving_funcs import mov_average_expw
l = [ 3.0, 7.0, 10.0, 20.0, 200.0 ]
y = np.zeros(1000)
y[np.array(l, dtype=int)] = 1  # index with integer positions
emav = mov_average_expw(y, 199)
plt.plot(emav)
plt.show()
This makes use of mov_average_expw from scikits.timeseries. Check that function's documentation to see how I came up with the span parameter based on your code's a variable.
Upvotes: 0
Reputation: 7592
AFAIK there's not a very good way to do this with numpy or the scipy.sparse module -- the sparse matrices in scipy.sparse are designed to be 2D matrices, and to create one in the first place you'd basically need to use the code you've already written in your first loop (i.e., to set all of the nonzero locations in a sparse matrix), with the additional complexity of always having to specify two index values.
As if that's not bad enough, np.convolve doesn't work with sparse arrays, so you'd still need to write out the computation in your second loop to compute the moving average.
My recommendation, which probably isn't much help if you're looking for a fancy numpy version, is to fall back on Python's excellent support as a general-purpose language:
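import numpy as np
import matplotlib.pyplot as plt
a = 0.01
l = {3, 7, 10, 20, 200}  # event indices, stored in a set for fast lookup
s = np.zeros(1000)
for i in range(len(s)):
    # int(i-1 in l) is 1 if an event occurred at the previous index, else 0
    s[i] = a * int(i-1 in l) + (1-a) * s[i-1]
plt.plot(s)
plt.show()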
Here, I've stored the event index values in l, just as you did, but I used a set to make lookup times O(1) -- though if len(l) isn't very large, you might even be better off with a plain list or tuple; you'd need to measure it to be sure. Then you can avoid creating the y array and just rely on Iverson's convention to convert the Boolean value i-1 in l into an int. You might not even need the explicit cast, but I find it helpful to be explicit.
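If a vectorized version is still wanted, one possible sketch (not something the loop above does) uses the fact that each event contributes a geometrically decaying tail, so the recursion s[i] = a*y[i-1] + (1-a)*s[i-1] can be written as a sum over events without building y at all. The function name ewma_of_events is hypothetical, and the events are assumed to be integer indices.

```python
import numpy as np

def ewma_of_events(events, n, a):
    """Closed form of s[i] = a*y[i-1] + (1-a)*s[i-1] for unit impulses at `events`."""
    t = np.arange(n)[:, None]            # sample indices, shape (n, 1)
    e = np.asarray(events)[None, :]      # event indices, shape (1, k)
    lag = t - 1 - e                      # steps since each event entered the filter
    # An event only contributes once lag >= 0; before that its weight is zero.
    contrib = np.where(lag >= 0, a * (1 - a) ** np.maximum(lag, 0), 0.0)
    return contrib.sum(axis=1)

s = ewma_of_events([3, 7, 10, 20, 200], n=1000, a=0.01)
```

This builds an n-by-k intermediate array (samples by events), so it only pays off when the number of events is small compared with the number of samples.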
Upvotes: 0