user2171391

Plot using pandas

I have some event times in a list and I would like to plot an exponentially weighted moving average of them. I can do this using the following code.

import numpy as np
import matplotlib.pyplot as plt

print("Code running")
a = 0.01
l = [3.0, 7.0, 10.0, 20.0, 200.0]
y = np.zeros(1000)
for item in l:
    y[int(item)] = 1  # mark each event time; array indices must be integers
s = np.zeros(1000)
x = np.linspace(0, 1000, 1000)
for i in range(1000):
    # exponentially weighted update from the previous sample
    s[i] = a*y[i-1] + (1-a)*s[i-1]
plt.plot(x, s)
plt.show()

This is clearly a horrible way to use Python, however. What's the right way to do this? Is it possible to do it without making all these extra sparse arrays?

The output should look like this.

[image: plot of the expected output]

Upvotes: 2

Views: 307

Answers (3)

elyase

Reputation: 40973

Pandas comes to mind for this task:

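import numpy as np
import pandas as pd

l = [3.0,7.0,10.0,20.0,200.0]
s = pd.Series(np.ones_like(l), index=l)
y = s.reindex(range(1000), fill_value=0)
# the second positional argument of pd.ewma is com, so pass the span by keyword
pd.ewma(y, span=199).plot()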

The span of 199 is related to your parameter a = 0.01 by a = 2/(n+1), i.e. n = 2/a - 1 = 199. Result: [image: plot of the resulting curve]
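In recent pandas releases the top-level `pd.ewma` function is no longer available, and the same smoothing is written with the `.ewm()` accessor. Here is a minimal sketch, assuming a pandas version that provides `Series.ewm` and assuming you want the plain recursion from the question rather than pandas' default weighting (hence `adjust=False`). The names `events`, `ewma` and `ewma_span` are mine.

```python
import numpy as np
import pandas as pd

a = 0.01
events = [3, 7, 10, 20, 200]           # event times as integer positions

# 1.0 at each event time, 0.0 everywhere else on a 0..999 grid
y = pd.Series(1.0, index=events).reindex(range(1000), fill_value=0.0)

# adjust=False gives the recursion s[t] = a*y[t] + (1-a)*s[t-1]
ewma = y.ewm(alpha=a, adjust=False).mean()

# span = 2/a - 1 describes the same smoothing in terms of a "period"
span = 2 / a - 1
ewma_span = y.ewm(span=span, adjust=False).mean()
```

Calling `ewma.plot()` draws the curve. It is the question's curve shifted one sample earlier, because the question's loop reads `y[i-1]` rather than `y[i]`.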

Upvotes: 1

Joseph Dunn

Reputation: 1298

I think you're looking for something like this:

import numpy as np
import matplotlib.pyplot as plt
from scikits.timeseries.lib.moving_funcs import mov_average_expw

l = [ 3.0, 7.0, 10.0, 20.0, 200.0 ]
y = np.zeros(1000)
y[np.array(l, dtype=int)] = 1  # index with integer positions; current NumPy rejects float indices
emav = mov_average_expw(y, 199)
plt.plot(emav)
plt.show()

This makes use of mov_average_expw from scikits.timeseries. Check that method's documentation to see how I came up with the span parameter based on your code's a variable.

Upvotes: 0

lmjohns3

Reputation: 7592

AFAIK there's not a very good way to do this with numpy or the scipy.sparse module -- the sparse matrices in scipy.sparse are designed to be 2D matrices, and to create one in the first place you'd basically need to use the code you've already written in your first loop (i.e., to set all of the nonzero locations in a sparse matrix), with the additional complexity of always having to specify two index values.

As if that's not bad enough, np.convolve doesn't work with sparse arrays, so you'd still need to write out the computation in your second loop to compute the moving average.
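On a dense array, though, the second loop can be written as a convolution with a truncated exponential kernel. This is only a rough sketch: the names `kernel` and `smoothed` are mine, and it still builds the dense `y` array, so it does not solve the sparsity problem.

```python
import numpy as np

a = 0.01
n = 1000
events = [3, 7, 10, 20, 200]

y = np.zeros(n)
y[events] = 1.0                          # dense indicator array of event times

# weights a*(1-a)**k for lag k = 0, 1, ..., n-1
kernel = a * (1 - a) ** np.arange(n)

# full convolution has length 2n-1; the first n entries are sum_k kernel[k]*y[t-k]
smoothed = np.convolve(y, kernel)[:n]

# the question's loop uses y[i-1], so its curve is the same one delayed by one sample
s = np.concatenate(([0.0], smoothed[:-1]))
```

The result `s` should match the array produced by the loop in the question, so `plt.plot(s)` gives the same picture.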

My recommendation, which probably isn't much help if you're looking for a fancy numpy version, is to fall back on Python's strengths as a general-purpose language:

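import numpy as np
import matplotlib.pyplot as plt

a = 0.01
l = set([3, 7, 10, 20, 200])  # event times; set membership tests are O(1)
s = np.zeros(1000)
for i in range(len(s)):
    # int(i-1 in l) is 1 when an event occurred at the previous step, else 0
    s[i] = a * int(i-1 in l) + (1-a) * s[i-1]
plt.plot(s)
plt.show()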

Here, I've stored the event index values in l, just as you did, but I used a set to make lookup times O(1). If len(l) isn't very large, you might even be better off with a plain list or tuple; you'd need to measure it to be sure. Then you can avoid creating the y array and just rely on Iverson's convention to convert the Boolean value of i-1 in l into an int. The explicit cast isn't strictly needed, since Python's bool is a subclass of int, but I find it helpful to be explicit.
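Because each event contributes its own independent exponential decay, the curve can also be computed directly from the event times, with no indicator array at all. Here is a sketch using NumPy broadcasting. The variable names are illustrative, and it assumes integer event times as in the question.

```python
import numpy as np

a = 0.01
n = 1000
events = np.array([3, 7, 10, 20, 200])

t = np.arange(n)
# lag[i, j] = i - 1 - events[j]: steps elapsed since the loop first saw event j
lag = t[:, None] - 1 - events[None, :]

# an event contributes a*(1-a)**lag once lag >= 0, and nothing before that
contrib = np.where(lag >= 0, a * (1 - a) ** np.maximum(lag, 0), 0.0)
s = contrib.sum(axis=1)
```

This builds a temporary of shape (n, len(events)), so it suits a handful of events; for many events the plain loop above is likely the safer choice.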

Upvotes: 0
