Ghoul Fool

Reputation: 6955

Simple rolling average - first few values

I've written a simple rolling average function that works well. I'd rather not use external libraries such as numpy or pandas.

def get_rolling_average(data, period):

  rolling = []
  for i in range (0, len(data)):

    end = i + period
    nums = data[i:end]

    # if i < (period-1):
    #   nums = data[0:i+1]
    #   rolling.append(mean(nums))

    if len(nums) == period:
      rolling.append(mean(nums))

  return rolling

def round_nicely(num, places):
  return round(num, places)

def mean(lst):
  # float() keeps the division from truncating under Python 2
  return round_nicely(float(sum(lst)) / len(lst), 1)


print("Rolling average!")

xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
print(get_rolling_average(xl, 3))

With the results being

Rolling average!
[56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]

However, I also want to include the first few values, where fewer than `period` values are available. In this example, that would be just 45.0 and 48.0.

Rolling average!
[45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]

 where
 (45)/1 = 45.0
 (45 + 51)/2 = 48.0

I'm not sure what the most Pythonic way to do this is. I've got a bit of a brain-freeze, and my most cohesive attempt is the three commented-out lines, but it skips a value.

Upvotes: 1

Views: 76

Answers (2)

jpp

Reputation: 164753

You were close. Try modifying your existing function as below.

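def get_rolling_average(data, period):
  # Relies on mean() as defined in the question.
  rolling = []

  for i in range(len(data)):

    # Window of `period` values ending at index i
    nums = data[i-period+1:i+1]

    if i < period-1:
      # Not enough history yet: average everything seen so far
      rolling.append(mean(data[:i+1]))
    elif len(nums) == period:
      rolling.append(mean(nums))

  return rolling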

Returns:

[45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]

Explanation

  • You want specific logic for i < period-1 versus i >= period-1. Structure your logic in this way.
  • Define nums = data[i-period+1:i+1] to capture each full window of `period` values (3 here) ending at index i.
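
The two cases above can also be folded into a single slice by clamping the window start at 0. This is only a sketch of that idea, not the function from the answer: the name rolling_average is hypothetical, and it rounds to one decimal place inline instead of calling the question's mean() helper.

```python
def rolling_average(data, period):
    """Mean of up to `period` trailing values ending at each index."""
    result = []
    for i in range(len(data)):
        # Clamp the start at 0 so the first few windows are simply shorter.
        start = max(0, i - period + 1)
        window = data[start:i + 1]
        # float() keeps the division exact under Python 2 as well.
        result.append(round(sum(window) / float(len(window)), 1))
    return result


xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
print(rolling_average(xl, 3))
# [45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
```

Because the window can never be empty, no separate branch is needed for the first period-1 indices.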

Once you are happy with this solution, you may wish to understand alternative implementations, e.g. itertools, numpy, pandas.

Upvotes: 1

AChampion

Reputation: 30268

One way to do this is with itertools: make a 3-way tee of the original list and chain an increasing number of sentinel values (0 here) onto the front of each copy, e.g.:

In []:
import itertools as it

xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
ts = [it.chain([0]*c, t) for c, t in enumerate(it.tee(xl, 3))]
[sum(x)/sum(1 for i in x if i) for x in it.zip_longest(*ts, fillvalue=0)]

Out[]:
[45.0,
 48.0,
 56.333333333333336,
 68.66666666666667,
 77.0,
 71.33333333333333,
 63.0,
 63.666666666666664,
 74.66666666666667,
 75.33333333333333,
 74.0,
 59.0]

If 0 is a valid value in the list then you can use another sentinel and explicitly filter it out.

Alternatively, you can use collections.deque with a maxlen=3, e.g.:

In []:
from collections import deque

d = deque(maxlen=3)
r = []
for x in xl:
    d.append(x)
    r.append(sum(d)/len(d))
for _ in range(len(d)-1):
    d.popleft()
    r.append(sum(d)/len(d))
r

Out[]:
[45.0,
 48.0,
 56.333333333333336,
 68.66666666666667,
 77.0,
 71.33333333333333,
 63.0,
 63.666666666666664,
 74.66666666666667,
 75.33333333333333,
 74.0,
 59.0]
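
Note that the second loop above also emits the trailing partial windows (74.0 and 59.0), which the question did not ask for. If only the leading partial windows are wanted, that loop can be dropped. A minimal sketch, assuming Python 3, with the hypothetical name leading_rolling_average and rounding to one decimal place as in the question:

```python
from collections import deque


def leading_rolling_average(data, period):
    # With maxlen set, appending to a full deque drops the oldest value.
    window = deque(maxlen=period)
    averages = []
    for x in data:
        window.append(x)
        averages.append(round(sum(window) / len(window), 1))
    return averages


xl = [45, 51, 73, 82, 76, 56, 57, 78, 89, 59]
print(leading_rolling_average(xl, 3))
# [45.0, 48.0, 56.3, 68.7, 77.0, 71.3, 63.0, 63.7, 74.7, 75.3]
```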

Upvotes: 0
