Reputation: 6955
I've written a simple rolling average function, which works well. I also don't want to use external libraries like numpy or pandas, just so you know.
def get_rolling_average(data, period):
    rolling = []
    for i in range(0, len(data)):
        end = i + period
        nums = data[i:end]
        # if i < (period-1):
        #     nums = data[0:i+1]
        #     rolling.append(mean(nums))
        if len(nums) == period:
            rolling.append(mean(nums))
    return rolling

def round_nicely(num, places):
    return round(num, places)

def mean(lst):
    summ = sum(lst[0:len(lst)])
    summ = float(summ)
    return round_nicely(summ/len(lst), 1)

print("Rolling average!")
xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
print(get_rolling_average(xl, 3))
With the results being
Rolling average!
[56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
However, I also want to include the first few averages, where fewer values than the period are available. In this example, that's just 45.0 and 48.0.
Rolling average!
[45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
where
(45)/1 = 45.0
(45 + 51)/2 = 48.0
I'm not sure what the most Pythonic way to do this is. I've got a bit of a brain-freeze, and my most cohesive attempt is the three commented-out lines, but it skips a value.
Upvotes: 1
Views: 76
Reputation: 164753
You were close. Try modifying your existing function as below.
def get_rolling_average(data, period):
    rolling = []
    for i in range(0, len(data)):
        nums = data[i-period+1:i+1]
        if i < period-1:
            rolling.append(mean(data[:i+1]))
        if (i >= period-1) and (len(nums) == period):
            rolling.append(mean(nums))
    return rolling
Returns:
[45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
Explanation
There are two cases to handle: i < period-1 versus i >= period-1. Structure your logic around them.
Use nums = data[i-period+1:i+1] to capture every full grouping of 3 values.
Once you are happy with this solution, you may wish to explore alternative implementations, e.g. itertools, numpy, pandas.
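The two cases above can also be folded into a single slice by clamping the start index at 0. The following is only a sketch of that idea, not part of the answer's code; the function name rolling_average and the places parameter are made up for illustration.

```python
def rolling_average(data, period, places=1):
    """Rolling mean; the first period-1 results use a shorter window."""
    result = []
    for i in range(len(data)):
        # Clamp the start so early windows simply contain fewer values.
        start = max(0, i - period + 1)
        window = data[start:i + 1]
        # float() keeps true division on Python 2 as well.
        result.append(round(sum(window) / float(len(window)), places))
    return result

xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
print(rolling_average(xl, 3))
# [45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
```

Because the slice never starts below index 0, no separate branch is needed for the first few elements.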
Upvotes: 1
Reputation: 30268
One way to do this would be to use itertools to chain a number of sentinel values to a 3-way tee of the original list, e.g.:
In []:
import itertools as it
xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
ts = [it.chain([0]*c, t) for c, t in enumerate(it.tee(xl, 3))]
[sum(x)/sum(1 for i in x if i) for x in it.zip_longest(*ts, fillvalue=0)]
Out[]:
[45.0,
48.0,
56.333333333333336,
68.66666666666667,
77.0,
71.33333333333333,
63.0,
63.666666666666664,
74.66666666666667,
75.33333333333333,
74.0,
59.0]
If 0 is a valid value in the list then you can use another sentinel and explicitly filter it out.
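As a rough sketch of that variation, a unique object() can stand in for the 0 sentinel so that real zeros are still counted. With 0 as the sentinel, zeros in the data are left out of the count, and a list that starts with 0 would divide by zero. The names _MISSING and rolling_average_sentinel are hypothetical, and Python 3 is assumed (zip_longest, true division).

```python
import itertools as it

_MISSING = object()  # hypothetical sentinel; can never collide with real data

def rolling_average_sentinel(data, period):
    # Shift each copy of the data right by 0, 1, ..., period-1 positions.
    shifted = [it.chain([_MISSING] * c, t)
               for c, t in enumerate(it.tee(data, period))]
    result = []
    for window in it.zip_longest(*shifted, fillvalue=_MISSING):
        # Filter by identity so that 0 is kept as a legitimate value.
        values = [v for v in window if v is not _MISSING]
        result.append(sum(values) / len(values))
    return result

print(rolling_average_sentinel([0, 0, 6], 2))
# [0.0, 0.0, 3.0, 6.0]
```

Like the snippet above, this also yields the shrinking windows at the end of the list.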
Alternatively, you can use collections.deque with a maxlen=3, e.g.:
In []:
from collections import deque
d = deque(maxlen=3)
r = []
for x in xl:
    d.append(x)
    r.append(sum(d)/len(d))
for _ in range(len(d)-1):
    d.popleft()
    r.append(sum(d)/len(d))
r
Out[]:
[45.0,
48.0,
56.333333333333336,
68.66666666666667,
77.0,
71.33333333333333,
63.0,
63.666666666666664,
74.66666666666667,
75.33333333333333,
74.0,
59.0]
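Note that both outputs above end with two extra values (74.0 and 59.0) from the shrinking windows at the tail, which the question did not ask for. A possible variant, sketched here under the assumption that only the leading partial windows are wanted, stops at the last element and keeps a running total instead of re-summing the deque each time. The name rolling_average_deque is illustrative only.

```python
from collections import deque

def rolling_average_deque(data, period):
    window = deque(maxlen=period)
    total = 0.0
    result = []
    for x in data:
        if len(window) == period:
            # The oldest value is about to be evicted by append().
            total -= window[0]
        window.append(x)
        total += x
        result.append(total / len(window))
    return result

xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
print([round(v, 1) for v in rolling_average_deque(xl, 3)])
# [45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
```

The running total makes each step constant-time regardless of the period, at the cost of possible floating-point drift on long non-integer inputs.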
Upvotes: 0